CopperSpice API
1.9.2
The QChar32 class implements a 32-bit Unicode code point.
Inherits CsChar
Public Types | |
enum | Category |
enum | Decomposition |
enum | Direction |
enum | JoiningType |
enum | Script |
enum | SpecialCharacter |
enum | UnicodeVersion |
Public Methods | |
QChar32 () = default | |
QChar32 (char c) | |
QChar32 (char16_t c) | |
QChar32 (char32_t c) | |
QChar32 (int c) | |
QChar32 (SpecialCharacter c) | |
~QChar32 () = default | |
Category | category () const |
unsigned char | combiningClass () const |
QString8 | decomposition () const |
Decomposition | decompositionTag () const |
int | digitValue () const |
Direction | direction () const |
bool | hasMirrored () const |
bool | isDigit () const |
bool | isLetter () const |
bool | isLetterOrNumber () const |
bool | isLower () const |
bool | isMark () const |
bool | isNonCharacter () const |
bool | isNull () const |
bool | isNumber () const |
bool | isPrint () const |
bool | isPunct () const |
bool | isSpace () const |
bool | isSymbol () const |
bool | isTitleCase () const |
bool | isUpper () const |
JoiningType | joiningType () const |
QChar32 | mirroredChar () const |
QChar32 & | operator= (QChar32 c) & |
Script | script () const |
QString8 | toCaseFolded () const |
QString16 | toCaseFolded16 () const |
char | toLatin1 () const |
QString8 | toLower () const |
QString8 | toTitleCase () const |
QString8 | toUpper () const |
uint32_t | unicode () const |
UnicodeVersion | unicodeVersion () const |
Static Public Methods | |
static UnicodeVersion | currentUnicodeVersion () |
static QChar32 | fromLatin1 (char c) |
Related Functions | |
These are not member functions | |
bool | operator!= (QChar32 c1, QChar32 c2) |
bool | operator< (QChar32 c1, QChar32 c2) |
QDataStream & | operator<< (QDataStream &stream, QChar32 ch) |
bool | operator<= (QChar32 c1, QChar32 c2) |
bool | operator== (QChar32 c1, QChar32 c2) |
bool | operator> (QChar32 c1, QChar32 c2) |
bool | operator>= (QChar32 c1, QChar32 c2) |
QDataStream & | operator>> (QDataStream &stream, QChar32 &ch) |
The QChar32 class implements a 32-bit Unicode code point. A 32-bit code point is the atomic unit of text and is represented as an integer where the data type is usually uint32_t.
Code points and characters are not the same. Code points are 32-bit as defined by the Unicode Consortium. This is not an arbitrary definition, nor can it be changed. When working with strings you need to think in terms of code points.
A code point is a character encoding term which refers to the numerical values defined by the Unicode standard.
The "latin capital letter A" is the symbol A with a code point value of U+0041. If this symbol is represented in UTF-8 it would require one byte which is one storage unit. In UTF-16 this same symbol would require two bytes, which is also one storage unit.
The symbol called "rightwards arrow with corner downwards" looks like ↴ and has a code point value of U+21B4. If this symbol is represented in UTF-8 it would require three bytes, which is three storage units. In UTF-16 this same symbol would require two bytes, which is one storage unit.
Each of these symbols is a single code point and can be stored in a single QChar32.
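The storage unit counts quoted for these two symbols follow from the UTF-8 and UTF-16 encoding rules. The sketch below is plain standard C++ and does not use the CopperSpice API. The helper names `utf8_units` and `utf16_units` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Number of 8-bit storage units needed to encode one code point in UTF-8.
// Assumes cp is a valid code point (0 .. 0x10FFFF, not a surrogate).
constexpr int utf8_units(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Number of 16-bit storage units needed to encode one code point in UTF-16.
// Code points above U+FFFF are stored as a surrogate pair (two units).
constexpr int utf16_units(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// Checks mirroring the two symbols discussed in the text.
inline void storage_unit_examples() {
    assert(utf8_units(0x0041) == 1);    // latin capital letter A
    assert(utf16_units(0x0041) == 1);
    assert(utf8_units(0x21B4) == 3);    // rightwards arrow with corner downwards
    assert(utf16_units(0x21B4) == 1);
}
```

In both encodings the number of storage units varies, while the symbol itself remains exactly one code point.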
For more information about strings, code points, text encodings, and internationalization refer to Unicode and Internationalization.
enum QChar32::Category |
This enum maps the Unicode character categories.
The following categories are normative in Unicode.
Constant | Value | Description |
---|---|---|
QChar32::Mark_NonSpacing | 0 | Unicode class name Mn |
QChar32::Mark_SpacingCombining | 1 | Unicode class name Mc |
QChar32::Mark_Enclosing | 2 | Unicode class name Me |
QChar32::Number_DecimalDigit | 3 | Unicode class name Nd |
QChar32::Number_Letter | 4 | Unicode class name Nl |
QChar32::Number_Other | 5 | Unicode class name No |
QChar32::Separator_Space | 6 | Unicode class name Zs |
QChar32::Separator_Line | 7 | Unicode class name Zl |
QChar32::Separator_Paragraph | 8 | Unicode class name Zp |
QChar32::Other_Control | 9 | Unicode class name Cc |
QChar32::Other_Format | 10 | Unicode class name Cf |
QChar32::Other_Surrogate | 11 | Unicode class name Cs |
QChar32::Other_PrivateUse | 12 | Unicode class name Co |
QChar32::Other_NotAssigned | 13 | Unicode class name Cn |
The following categories are informative in Unicode.
Constant | Value | Description |
---|---|---|
QChar32::Letter_Uppercase | 14 | Unicode class name Lu |
QChar32::Letter_Lowercase | 15 | Unicode class name Ll |
QChar32::Letter_Titlecase | 16 | Unicode class name Lt |
QChar32::Letter_Modifier | 17 | Unicode class name Lm |
QChar32::Letter_Other | 18 | Unicode class name Lo |
QChar32::Punctuation_Connector | 19 | Unicode class name Pc |
QChar32::Punctuation_Dash | 20 | Unicode class name Pd |
QChar32::Punctuation_Open | 21 | Unicode class name Ps |
QChar32::Punctuation_Close | 22 | Unicode class name Pe |
QChar32::Punctuation_InitialQuote | 23 | Unicode class name Pi |
QChar32::Punctuation_FinalQuote | 24 | Unicode class name Pf |
QChar32::Punctuation_Other | 25 | Unicode class name Po |
QChar32::Symbol_Math | 26 | Unicode class name Sm |
QChar32::Symbol_Currency | 27 | Unicode class name Sc |
QChar32::Symbol_Modifier | 28 | Unicode class name Sk |
QChar32::Symbol_Other | 29 | Unicode class name So |
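Because the values in each category group are contiguous, predicates such as isMark(), isNumber(), isLetter(), isPunct(), and isSymbol() can be pictured as range checks on the category value. The sketch below is only an illustration in standard C++. The local `Category` enum copies the values from the tables above, and the free functions are hypothetical stand-ins, not the library implementation.

```cpp
#include <cassert>

// Local copy of the QChar32::Category values listed in the tables above.
enum Category {
    Mark_NonSpacing = 0, Mark_SpacingCombining, Mark_Enclosing,
    Number_DecimalDigit, Number_Letter, Number_Other,
    Separator_Space, Separator_Line, Separator_Paragraph,
    Other_Control, Other_Format, Other_Surrogate, Other_PrivateUse, Other_NotAssigned,
    Letter_Uppercase, Letter_Lowercase, Letter_Titlecase, Letter_Modifier, Letter_Other,
    Punctuation_Connector, Punctuation_Dash, Punctuation_Open, Punctuation_Close,
    Punctuation_InitialQuote, Punctuation_FinalQuote, Punctuation_Other,
    Symbol_Math, Symbol_Currency, Symbol_Modifier, Symbol_Other
};

// True when c lies in the inclusive range [first, last].
constexpr bool in_group(Category c, Category first, Category last) {
    return c >= first && c <= last;
}

constexpr bool is_mark(Category c)   { return in_group(c, Mark_NonSpacing, Mark_Enclosing); }
constexpr bool is_number(Category c) { return in_group(c, Number_DecimalDigit, Number_Other); }
constexpr bool is_letter(Category c) { return in_group(c, Letter_Uppercase, Letter_Other); }
constexpr bool is_punct(Category c)  { return in_group(c, Punctuation_Connector, Punctuation_Other); }
constexpr bool is_symbol(Category c) { return in_group(c, Symbol_Math, Symbol_Other); }

// Printable is documented as "any character not of category Cc or Cn".
constexpr bool is_print(Category c) {
    return c != Other_Control && c != Other_NotAssigned;
}

static_assert(Symbol_Other == 29, "enum values must match the documented table");
```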
enum QChar32::Decomposition |
This enum type defines the Unicode decomposition attributes.
Constant | Value |
---|---|
QChar32::NoDecomposition | 0 |
QChar32::Canonical | 1 |
QChar32::Circle | 8 |
QChar32::Compat | 16 |
QChar32::Final | 6 |
QChar32::Font | 2 |
QChar32::Fraction | 17 |
QChar32::Initial | 4 |
QChar32::Isolated | 7 |
QChar32::Medial | 5 |
QChar32::Narrow | 13 |
QChar32::NoBreak | 3 |
QChar32::Small | 14 |
QChar32::Square | 15 |
QChar32::Sub | 10 |
QChar32::Super | 9 |
QChar32::Vertical | 11 |
QChar32::Wide | 12 |
enum QChar32::Direction |
This enum type defines the Unicode direction attributes. In order to conform to C/C++ naming conventions "Dir" is prepended to the codes used in the Unicode Standard.
Constant | Value |
---|---|
QChar32::DirAL | 13 |
QChar32::DirAN | 5 |
QChar32::DirB | 7 |
QChar32::DirBN | 18 |
QChar32::DirCS | 6 |
QChar32::DirEN | 2 |
QChar32::DirES | 3 |
QChar32::DirET | 4 |
QChar32::DirFSI | 21 |
QChar32::DirL | 0 |
QChar32::DirLRE | 11 |
QChar32::DirLRI | 19 |
QChar32::DirLRO | 12 |
QChar32::DirNSM | 17 |
QChar32::DirON | 10 |
QChar32::DirPDF | 16 |
QChar32::DirPDI | 22 |
QChar32::DirR | 1 |
QChar32::DirRLE | 14 |
QChar32::DirRLI | 20 |
QChar32::DirRLO | 15 |
QChar32::DirS | 8 |
QChar32::DirWS | 9 |
enum QChar32::JoiningType |
This enum type defines the Unicode joining type attributes. In order to conform to C/C++ naming conventions "Joining_" is prepended to the codes used in the Unicode Standard.
Constant | Value |
---|---|
QChar32::Joining_None | 0 |
QChar32::Joining_Causing | 1 |
QChar32::Joining_Dual | 2 |
QChar32::Joining_Left | 3 |
QChar32::Joining_Right | 4 |
QChar32::Joining_Transparent | 5 |
enum QChar32::Script |
This enum type defines the Unicode script property values. In order to conform to C/C++ naming conventions "Script_" is prepended to the codes used in the Unicode Standard.
Refer to the Unicode Standard Annex #24 for a description of the Unicode script properties.
Constant | Value | Description |
---|---|---|
QChar32::Script_Unknown | 0 | For unassigned, private-use, noncharacter, and surrogate code points. |
QChar32::Script_Inherited | 1 | For characters that may be used with multiple scripts and that inherit their script from the preceding characters. These include nonspacing marks, enclosing marks, and zero width joiner/non-joiner characters. |
QChar32::Script_Common | 2 | For characters that may be used with multiple scripts and that do not inherit their script from the preceding characters. |
QChar32::Script_Latin | 3 | |
QChar32::Script_Greek | 4 | |
QChar32::Script_Cyrillic | 5 | |
QChar32::Script_Armenian | 6 | |
QChar32::Script_Hebrew | 7 | |
QChar32::Script_Arabic | 8 | |
QChar32::Script_Syriac | 9 | |
QChar32::Script_Thaana | 10 | |
QChar32::Script_Devanagari | 11 | |
QChar32::Script_Bengali | 12 | |
QChar32::Script_Gurmukhi | 13 | |
QChar32::Script_Gujarati | 14 | |
QChar32::Script_Oriya | 15 | |
QChar32::Script_Tamil | 16 | |
QChar32::Script_Telugu | 17 | |
QChar32::Script_Kannada | 18 | |
QChar32::Script_Malayalam | 19 | |
QChar32::Script_Sinhala | 20 | |
QChar32::Script_Thai | 21 | |
QChar32::Script_Lao | 22 | |
QChar32::Script_Tibetan | 23 | |
QChar32::Script_Myanmar | 24 | |
QChar32::Script_Georgian | 25 | |
QChar32::Script_Hangul | 26 | |
QChar32::Script_Ethiopic | 27 | |
QChar32::Script_Cherokee | 28 | |
QChar32::Script_CanadianAboriginal | 29 | |
QChar32::Script_Ogham | 30 | |
QChar32::Script_Runic | 31 | |
QChar32::Script_Khmer | 32 | |
QChar32::Script_Mongolian | 33 | |
QChar32::Script_Hiragana | 34 | |
QChar32::Script_Katakana | 35 | |
QChar32::Script_Bopomofo | 36 | |
QChar32::Script_Han | 37 | |
QChar32::Script_Yi | 38 | |
QChar32::Script_OldItalic | 39 | |
QChar32::Script_Gothic | 40 | |
QChar32::Script_Deseret | 41 | |
QChar32::Script_Tagalog | 42 | |
QChar32::Script_Hanunoo | 43 | |
QChar32::Script_Buhid | 44 | |
QChar32::Script_Tagbanwa | 45 | |
QChar32::Script_Coptic | 46 | |
QChar32::Script_Limbu | 47 | |
QChar32::Script_TaiLe | 48 | |
QChar32::Script_LinearB | 49 | |
QChar32::Script_Ugaritic | 50 | |
QChar32::Script_Shavian | 51 | |
QChar32::Script_Osmanya | 52 | |
QChar32::Script_Cypriot | 53 | |
QChar32::Script_Braille | 54 | |
QChar32::Script_Buginese | 55 | |
QChar32::Script_NewTaiLue | 56 | |
QChar32::Script_Glagolitic | 57 | |
QChar32::Script_Tifinagh | 58 | |
QChar32::Script_SylotiNagri | 59 | |
QChar32::Script_OldPersian | 60 | |
QChar32::Script_Kharoshthi | 61 | |
QChar32::Script_Balinese | 62 | |
QChar32::Script_Cuneiform | 63 | |
QChar32::Script_Phoenician | 64 | |
QChar32::Script_PhagsPa | 65 | |
QChar32::Script_Nko | 66 | |
QChar32::Script_Sundanese | 67 | |
QChar32::Script_Lepcha | 68 | |
QChar32::Script_OlChiki | 69 | |
QChar32::Script_Vai | 70 | |
QChar32::Script_Saurashtra | 71 | |
QChar32::Script_KayahLi | 72 | |
QChar32::Script_Rejang | 73 | |
QChar32::Script_Lycian | 74 | |
QChar32::Script_Carian | 75 | |
QChar32::Script_Lydian | 76 | |
QChar32::Script_Cham | 77 | |
QChar32::Script_TaiTham | 78 | |
QChar32::Script_TaiViet | 79 | |
QChar32::Script_Avestan | 80 | |
QChar32::Script_EgyptianHieroglyphs | 81 | |
QChar32::Script_Samaritan | 82 | |
QChar32::Script_Lisu | 83 | |
QChar32::Script_Bamum | 84 | |
QChar32::Script_Javanese | 85 | |
QChar32::Script_MeeteiMayek | 86 | |
QChar32::Script_ImperialAramaic | 87 | |
QChar32::Script_OldSouthArabian | 88 | |
QChar32::Script_InscriptionalParthian | 89 | |
QChar32::Script_InscriptionalPahlavi | 90 | |
QChar32::Script_OldTurkic | 91 | |
QChar32::Script_Kaithi | 92 | |
QChar32::Script_Batak | 93 | |
QChar32::Script_Brahmi | 94 | |
QChar32::Script_Mandaic | 95 | |
QChar32::Script_Chakma | 96 | |
QChar32::Script_MeroiticCursive | 97 | |
QChar32::Script_MeroiticHieroglyphs | 98 | |
QChar32::Script_Miao | 99 | |
QChar32::Script_Sharada | 100 | |
QChar32::Script_SoraSompeng | 101 | |
QChar32::Script_Takri | 102 | |
QChar32::Script_CaucasianAlbanian | 103 | |
QChar32::Script_BassaVah | 104 | |
QChar32::Script_Duployan | 105 | |
QChar32::Script_Elbasan | 106 | |
QChar32::Script_Grantha | 107 | |
QChar32::Script_PahawhHmong | 108 | |
QChar32::Script_Khojki | 109 | |
QChar32::Script_LinearA | 110 | |
QChar32::Script_Mahajani | 111 | |
QChar32::Script_Manichaean | 112 | |
QChar32::Script_MendeKikakui | 113 | |
QChar32::Script_Modi | 114 | |
QChar32::Script_Mro | 115 | |
QChar32::Script_OldNorthArabian | 116 | |
QChar32::Script_Nabataean | 117 | |
QChar32::Script_Palmyrene | 118 | |
QChar32::Script_PauCinHau | 119 | |
QChar32::Script_OldPermic | 120 | |
QChar32::Script_PsalterPahlavi | 121 | |
QChar32::Script_Siddham | 122 | |
QChar32::Script_Khudawadi | 123 | |
QChar32::Script_Tirhuta | 124 | |
QChar32::Script_WarangCiti | 125 | |
QChar32::Script_Ahom | 126 | |
QChar32::Script_AnatolianHieroglyphs | 127 | |
QChar32::Script_Hatran | 128 | |
QChar32::Script_Multani | 129 | |
QChar32::Script_OldHungarian | 130 | |
QChar32::Script_SignWriting | 131 |
enum QChar32::SpecialCharacter |
This enum is provided for use with the QChar32 constructor which supports special characters.
Constant | Value | Description |
---|---|---|
QChar32::Null | 0x0000 | QChar32 with this value isNull() |
QChar32::Tabulation | 0x0009 | Character tabulation. |
QChar32::LineFeed | 0x000a | |
QChar32::CarriageReturn | 0x000d | |
QChar32::Space | 0x0020 | |
QChar32::Nbsp | 0x00a0 | Non-breaking space |
QChar32::SoftHyphen | 0x00ad | |
QChar32::ReplacementCharacter | 0xfffd | The character shown when a font has no glyph for a certain code point. A special question mark character is often used. Codecs use this code point when input data cannot be represented in Unicode. |
QChar32::ObjectReplacementCharacter | 0xfffc | Used to represent an object such as an image when such objects cannot be presented. |
QChar32::ByteOrderMark | 0xfeff | |
QChar32::ByteOrderSwapped | 0xfffe | |
QChar32::ParagraphSeparator | 0x2029 | |
QChar32::LineSeparator | 0x2028 | |
QChar32::LastValidCodePoint | 0x10ffff |
enum QChar32::UnicodeVersion |
Specifies which version of the Unicode standard introduced a certain character.
Constant | Value | Description |
---|---|---|
QChar32::Unicode_1_1 | 1 | Version 1.1 |
QChar32::Unicode_2_0 | 2 | Version 2.0 |
QChar32::Unicode_2_1_2 | 3 | Version 2.1.2 |
QChar32::Unicode_3_0 | 4 | Version 3.0 |
QChar32::Unicode_3_1 | 5 | Version 3.1 |
QChar32::Unicode_3_2 | 6 | Version 3.2 |
QChar32::Unicode_4_0 | 7 | Version 4.0 |
QChar32::Unicode_4_1 | 8 | Version 4.1 |
QChar32::Unicode_5_0 | 9 | Version 5.0 |
QChar32::Unicode_5_1 | 10 | Version 5.1 |
QChar32::Unicode_5_2 | 11 | Version 5.2 |
QChar32::Unicode_6_0 | 12 | Version 6.0 |
QChar32::Unicode_6_1 | 13 | Version 6.1 |
QChar32::Unicode_6_2 | 14 | Version 6.2 |
QChar32::Unicode_6_3 | 15 | Version 6.3 |
QChar32::Unicode_7_0 | 16 | Version 7.0 |
QChar32::Unicode_8_0 | 17 | Version 8.0 |
QChar32::Unicode_Unassigned | 0 | Value is not assigned to any character in version 8.0 of Unicode. |
QChar32::QChar32 | ( | ) |
default |
Constructs a null QChar32.
QChar32::QChar32 | ( | char c | ) |
inline explicit |
Constructs a QChar32 corresponding to the Latin-1 character c.
QChar32::QChar32 | ( | char16_t c | ) |
inline |
Constructs a QChar32 for the character with Unicode code point c.
QChar32::QChar32 | ( | char32_t c | ) |
inline |
Constructs a QChar32 for the character with Unicode code point c.
QChar32::QChar32 | ( | int c | ) |
inline |
Constructs a QChar32 for the character with Unicode code point c.
QChar32::QChar32 | ( | SpecialCharacter c | ) |
inline |
Constructs a QChar32 for the predefined character value c.
QChar32::~QChar32 | ( | ) |
default |
Destroys a QChar32.
Category QChar32::category | ( | ) | const |
Returns the category specified by the Unicode standard.
unsigned char QChar32::combiningClass | ( | ) | const |
Returns the combining class for the character as defined in the Unicode standard. This is mainly useful as a positioning hint for marks attached to a base character.
The text rendering engine uses this information to correctly position non-spacing marks around a base character.
UnicodeVersion QChar32::currentUnicodeVersion | ( | ) |
static |
Returns the most recent supported Unicode version.
QString8 QChar32::decomposition | ( | ) | const |
Decomposes a character into a base character followed by one or more combining characters. Returns an empty string if no decomposition exists.
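As an illustration of what a decomposition looks like, the sketch below maps two precomposed Latin letters to a base character followed by a combining mark. It is standard C++ with a hypothetical two-entry table and a hypothetical function name `decompose`. The real data comes from the Unicode Character Database and this is not the CopperSpice implementation.

```cpp
#include <cassert>
#include <string>

// Returns the decomposition for a tiny, hand-picked set of code points.
// An empty string means this sketch knows of no decomposition.
inline std::u32string decompose(char32_t cp) {
    switch (cp) {
        case 0x00E9: return U"e\u0301";   // e with acute -> e + combining acute accent
        case 0x00C5: return U"A\u030A";   // A with ring above -> A + combining ring above
        default:     return std::u32string();
    }
}

inline void decomposition_examples() {
    assert(decompose(0x00E9).size() == 2);   // base character plus one combining mark
    assert(decompose(0x0041).empty());       // plain A has no decomposition
}
```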
Decomposition QChar32::decompositionTag | ( | ) | const |
Returns the tag defining the decomposition of the Unicode character. Returns QChar32::NoDecomposition if no decomposition exists.
int QChar32::digitValue | ( | ) | const |
Returns the numeric value of the digit as specified by the Unicode standard. Returns -1 if the character is not a digit.
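Decimal digits in Unicode are not limited to 0-9. Many scripts have their own run of ten consecutive digit code points. The sketch below shows the idea for three such runs in standard C++. The function name `digit_value` and the choice of runs are assumptions for illustration, and a real implementation would consult the full Unicode tables.

```cpp
#include <cassert>

// Returns the value 0..9 for a digit in one of three known runs, otherwise -1.
// Runs covered: ASCII digits (U+0030), Arabic-Indic digits (U+0660),
// and Devanagari digits (U+0966).
constexpr int digit_value(char32_t cp) {
    return (cp >= 0x0030 && cp <= 0x0039) ? static_cast<int>(cp - 0x0030)
         : (cp >= 0x0660 && cp <= 0x0669) ? static_cast<int>(cp - 0x0660)
         : (cp >= 0x0966 && cp <= 0x096F) ? static_cast<int>(cp - 0x0966)
         : -1;
}

inline void digit_value_examples() {
    assert(digit_value(0x0037) == 7);    // ASCII digit seven
    assert(digit_value(0x0663) == 3);    // Arabic-Indic digit three
    assert(digit_value(0x0041) == -1);   // letter A is not a digit
}
```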
Direction QChar32::direction | ( | ) | const |
Returns the direction specified by the Unicode standard.
QChar32 QChar32::fromLatin1 | ( | char c | ) |
inline static |
Converts the Latin-1 character c to its equivalent QChar32. This is mainly useful for non-internationalized software.
bool QChar32::hasMirrored | ( | ) | const |
Returns true if the character should be reversed if the text direction is reversed, otherwise returns false. Equivalent to calling the following code.
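The code this description refers to is not reproduced here. One plausible reading, given that mirroredChar() returns the character itself when no mirror exists, is a comparison of the character with its mirrored form. The sketch below models that relationship in standard C++. The functions `mirrored_char` and `has_mirrored` are hypothetical and cover only four ASCII bracket pairs, so this is an assumption about the relationship rather than the library source.

```cpp
#include <cassert>

// Hypothetical mirror lookup limited to four ASCII bracket pairs.
// Any other code point is returned unchanged.
inline char32_t mirrored_char(char32_t cp) {
    switch (cp) {
        case U'(': return U')';
        case U')': return U'(';
        case U'[': return U']';
        case U']': return U'[';
        case U'{': return U'}';
        case U'}': return U'{';
        case U'<': return U'>';
        case U'>': return U'<';
        default:   return cp;
    }
}

// A character "has a mirror" when its mirrored form differs from itself.
inline bool has_mirrored(char32_t cp) {
    return mirrored_char(cp) != cp;
}

inline void mirrored_examples() {
    assert(has_mirrored(U'('));
    assert(!has_mirrored(U'A'));
    assert(mirrored_char(mirrored_char(U'[')) == U'[');   // mirroring twice round-trips
}
```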
bool QChar32::isDigit | ( | ) | const |
inline |
Returns true if the character is a decimal digit (Number_DecimalDigit), otherwise returns false.
bool QChar32::isLetter | ( | ) | const |
Returns true if the character is a letter (Letter_* categories), otherwise returns false.
bool QChar32::isLetterOrNumber | ( | ) | const |
Returns true if the character is a letter or number (Letter_* or Number_* categories), otherwise returns false.
bool QChar32::isLower | ( | ) | const |
inline |
Returns true if the character is a lowercase letter, i.e. category() is Letter_Lowercase.
bool QChar32::isMark | ( | ) | const |
Returns true if the character is a mark (Mark_* categories), otherwise returns false.
Refer to QChar32::Category for more information regarding marks.
bool QChar32::isNonCharacter | ( | ) | const |
Returns true if the QChar32 is a non-character, otherwise returns false.
Unicode has a certain number of code points that are classified as "non-characters", which are used for internal purposes in applications but cannot be used for text interchange. These are the last two entries in each Unicode plane ([0xfffe..0xffff], [0x1fffe..0x1ffff], etc.) as well as the entries in the range [0xfdd0..0xfdef].
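The two ranges described above translate into a short test on the numeric value. The sketch below is standard C++ and independent of CopperSpice. The name `is_noncharacter` is hypothetical, and values above 0x10ffff are treated as not being code points at all.

```cpp
#include <cassert>

// True for the last two code points of every plane (xxFFFE and xxFFFF)
// and for the block U+FDD0..U+FDEF.
constexpr bool is_noncharacter(char32_t cp) {
    return cp <= 0x10FFFF &&
           ((cp & 0xFFFE) == 0xFFFE || (cp >= 0xFDD0 && cp <= 0xFDEF));
}

inline void noncharacter_examples() {
    assert(is_noncharacter(0xFFFE));
    assert(is_noncharacter(0x1FFFF));
    assert(is_noncharacter(0xFDD0));
    assert(!is_noncharacter(0xFFFD));   // replacement character is a real character
}
```

Masking with 0xFFFE ignores the lowest bit, so one comparison covers both of the final entries in each plane.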
bool QChar32::isNull | ( | ) | const |
inline |
Returns true if the character is the Unicode character 0x0000 ('\0'), otherwise returns false.
bool QChar32::isNumber | ( | ) | const |
Returns true if the character is a number (Number_* categories, not just 0-9), otherwise returns false.
bool QChar32::isPrint | ( | ) | const |
Returns true if the character is a printable character, otherwise returns false. This is any character not of category Cc or Cn. This method gives no indication of whether the character is available in a particular font.
bool QChar32::isPunct | ( | ) | const |
Returns true if the character is a punctuation mark (Punctuation_* categories), otherwise returns false.
bool QChar32::isSpace | ( | ) | const |
Returns true if the character is a separator character, otherwise returns false.
bool QChar32::isSymbol | ( | ) | const |
Returns true if the character is a symbol (Symbol_* categories), otherwise returns false.
bool QChar32::isTitleCase | ( | ) | const |
inline |
Returns true if the character is a titlecase letter, i.e. category() is Letter_Titlecase.
bool QChar32::isUpper | ( | ) | const |
inline |
Returns true if the character is an uppercase letter, i.e. category() is Letter_Uppercase.
JoiningType QChar32::joiningType | ( | ) | const |
Returns information about the joining type attributes of the character. This information is needed for certain languages such as Arabic or Syriac.
QChar32 QChar32::mirroredChar | ( | ) | const |
Returns the mirrored character if this character is a mirrored character in the Unicode standard, otherwise returns the character itself.
QChar32 & QChar32::operator= | ( | QChar32 c | ) | & |
inline |
Copy assigns from c and returns a reference to this object.
Script QChar32::script | ( | ) | const |
Returns the Unicode script property value for this Unicode character.
QString8 QChar32::toCaseFolded | ( | ) | const |
Returns the case folded equivalent of this Unicode character. For most Unicode characters this is the same as toLower().
QString16 QChar32::toCaseFolded16 | ( | ) | const |
Returns the case folded equivalent of the Unicode character. For most Unicode characters this is the same as toLower().
char QChar32::toLatin1 | ( | ) | const |
inline |
Returns the Latin-1 character equivalent to the QChar32 or a null character. This is mainly useful for non-internationalized software.
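Latin-1 occupies the first 256 Unicode code points, so the conversion in both directions can be pictured as a range check and a cast. The sketch below is standard C++ with hypothetical helper names `to_latin1` and `from_latin1`. It assumes the "null character" fallback means '\0' and is not the library implementation.

```cpp
#include <cassert>

// Widens a Latin-1 byte to its code point. The cast through unsigned char
// keeps bytes above 0x7F from becoming negative where char is signed.
constexpr char32_t from_latin1(char c) {
    return static_cast<unsigned char>(c);
}

// Narrows a code point to Latin-1, or returns '\0' when it does not fit.
constexpr char to_latin1(char32_t cp) {
    return cp <= 0xFF ? static_cast<char>(cp) : '\0';
}

inline void latin1_examples() {
    assert(to_latin1(0x0041) == 'A');
    assert(to_latin1(0x21B4) == '\0');                  // arrow is outside Latin-1
    assert(from_latin1(to_latin1(0x00E9)) == 0x00E9);   // round trip for e with acute
}
```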
QString8 QChar32::toLower | ( | ) | const |
Returns the lowercase equivalent if the character is uppercase or titlecase, otherwise returns the character itself.
QString8 QChar32::toTitleCase | ( | ) | const |
Returns the title case equivalent if the character is lowercase or uppercase, otherwise returns the character itself.
QString8 QChar32::toUpper | ( | ) | const |
Returns the uppercase equivalent if the character is lowercase or titlecase, otherwise returns the character itself.
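The case conversion methods return a string rather than a single QChar32. One reason a string is a natural return type is that Unicode defines some case mappings that expand to more than one code point, for example the uppercase form of U+00DF (ß) is "SS". The sketch below illustrates this in standard C++. The function `to_upper` is hypothetical, handles only ASCII letters plus that one special case, and whether CopperSpice applies the same mapping is an assumption.

```cpp
#include <cassert>
#include <string>

// Uppercases a single code point, returning a string because the result
// may contain more than one code point.
inline std::u32string to_upper(char32_t cp) {
    if (cp >= U'a' && cp <= U'z') {
        return std::u32string(1, static_cast<char32_t>(cp - 0x20));
    }
    if (cp == 0x00DF) {
        return U"SS";   // special casing: one code point becomes two
    }
    return std::u32string(1, cp);   // everything else is returned unchanged
}

inline void to_upper_examples() {
    assert(to_upper(U'a').size() == 1);
    assert(to_upper(0x00DF).size() == 2);
}
```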
uint32_t QChar32::unicode | ( | ) | const |
inline |
Returns the numeric Unicode value stored in the current QChar32.
UnicodeVersion QChar32::unicodeVersion | ( | ) | const |
Returns the Unicode version which introduced this character.
bool operator!= | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if c1 and c2 are not the same Unicode character, otherwise returns false.
bool operator< | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if the numeric Unicode value of c1 is less than that of c2, otherwise returns false.
QDataStream & operator<< | ( | QDataStream &stream, QChar32 ch | ) |
related |
Writes the given ch to the stream. Returns a reference to the stream.
Refer to Serializing Data Types for additional information.
bool operator<= | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if the numeric Unicode value of c1 is less than or equal to that of c2, otherwise returns false.
bool operator== | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if c1 and c2 are the same Unicode character, otherwise returns false.
bool operator> | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if the numeric Unicode value of c1 is greater than that of c2, otherwise returns false.
bool operator>= | ( | QChar32 c1, QChar32 c2 | ) |
related |
Returns true if the numeric Unicode value of c1 is greater than or equal to that of c2, otherwise returns false.
QDataStream & operator>> | ( | QDataStream &stream, QChar32 &ch | ) |
related |
Reads from the stream into the given ch. Returns a reference to the stream.
Refer to Serializing Data Types for additional information.