CopperSpice API  1.9.2
QChar32 Class Reference

The QChar32 class implements a 32-bit Unicode code point.

Inherits CsChar

Public Types

enum  Category
 
enum  Decomposition
 
enum  Direction
 
enum  JoiningType
 
enum  Script
 
enum  SpecialCharacter
 
enum  UnicodeVersion
 

Public Methods

 QChar32 () = default
 
 QChar32 (char c)
 
 QChar32 (char16_t c)
 
 QChar32 (char32_t c)
 
 QChar32 (int c)
 
 QChar32 (SpecialCharacter c)
 
 ~QChar32 () = default
 
Category category () const
 
unsigned char combiningClass () const
 
QString8 decomposition () const
 
Decomposition decompositionTag () const
 
int digitValue () const
 
Direction direction () const
 
bool hasMirrored () const
 
bool isDigit () const
 
bool isLetter () const
 
bool isLetterOrNumber () const
 
bool isLower () const
 
bool isMark () const
 
bool isNonCharacter () const
 
bool isNull () const
 
bool isNumber () const
 
bool isPrint () const
 
bool isPunct () const
 
bool isSpace () const
 
bool isSymbol () const
 
bool isTitleCase () const
 
bool isUpper () const
 
JoiningType joiningType () const
 
QChar32 mirroredChar () const
 
QChar32 & operator= (QChar32 c) &
 
Script script () const
 
QString8 toCaseFolded () const
 
QString16 toCaseFolded16 () const
 
char toLatin1 () const
 
QString8 toLower () const
 
QString8 toTitleCase () const
 
QString8 toUpper () const
 
uint32_t unicode () const
 
UnicodeVersion unicodeVersion () const
 

Static Public Methods

static UnicodeVersion currentUnicodeVersion ()
 
static QChar32 fromLatin1 (char c)
 

Related Functions

These are not member functions

bool operator!= (QChar32 c1, QChar32 c2)
 
bool operator< (QChar32 c1, QChar32 c2)
 
QDataStream & operator<< (QDataStream &stream, QChar32 ch)
 
bool operator<= (QChar32 c1, QChar32 c2)
 
bool operator== (QChar32 c1, QChar32 c2)
 
bool operator> (QChar32 c1, QChar32 c2)
 
bool operator>= (QChar32 c1, QChar32 c2)
 
QDataStream & operator>> (QDataStream &stream, QChar32 &ch)
 

Detailed Description

The QChar32 class implements a 32-bit Unicode code point. A 32-bit code point is the atomic unit of text and is represented as an integer where the data type is usually uint32_t.

Code points and characters are not the same. Code points are 32-bit as defined by the Unicode consortium. This is not an arbitrary definition, nor can it be changed. When working with strings you need to think in terms of code points.

A code point is a character encoding term which refers to the numerical values defined by the Unicode standard.

Unicode Code Points

The "latin capital letter A" is the symbol A with a code point value of U+0041. If this symbol is represented in UTF-8 it would require one byte which is one storage unit. In UTF-16 this same symbol would require two bytes, which is also one storage unit.

The symbol called "rightwards arrow with corner downwards" looks like ↴ and has a code point value of U+21B4. If this symbol is represented in UTF-8 it would require three bytes, which is three storage units. In UTF-16 this same symbol would require two bytes, which is one storage unit.

Both of these symbols are each one code point and can be stored in a single QChar32.
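
The storage unit counts above follow directly from the UTF-8 and UTF-16 encoding rules. The sketch below uses only standard C++ and does not call any CopperSpice API. The function names utf8Units() and utf16Units() are hypothetical helpers invented for this illustration, and surrogate code points are not treated specially.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of 8-bit storage units UTF-8 needs for one code point.
inline int utf8Units(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);          // largest valid Unicode code point
    if (cp < 0x80)    return 1;      // U+0000..U+007F
    if (cp < 0x800)   return 2;      // U+0080..U+07FF
    if (cp < 0x10000) return 3;      // U+0800..U+FFFF
    return 4;                        // U+10000..U+10FFFF
}

// Hypothetical helper: number of 16-bit storage units UTF-16 needs for one code point.
inline int utf16Units(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 1 : 2;     // a surrogate pair is needed above U+FFFF
}
```

With these helpers, utf8Units(0x0041) and utf16Units(0x0041) both give 1, while utf8Units(0x21B4) gives 3 and utf16Units(0x21B4) gives 1. In every case the value is still a single code point.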

For more information about strings, code points, text encodings, and internationalization refer to Unicode and Internationalization.

See also
QString8

Member Enumeration Documentation

enum QChar32::Category

This enum maps the Unicode character categories.

The following categories are normative in Unicode.

Constant                            Value  Description
QChar32::Mark_NonSpacing            0      Unicode class name Mn
QChar32::Mark_SpacingCombining      1      Unicode class name Mc
QChar32::Mark_Enclosing             2      Unicode class name Me
QChar32::Number_DecimalDigit        3      Unicode class name Nd
QChar32::Number_Letter              4      Unicode class name Nl
QChar32::Number_Other               5      Unicode class name No
QChar32::Separator_Space            6      Unicode class name Zs
QChar32::Separator_Line             7      Unicode class name Zl
QChar32::Separator_Paragraph        8      Unicode class name Zp
QChar32::Other_Control              9      Unicode class name Cc
QChar32::Other_Format               10     Unicode class name Cf
QChar32::Other_Surrogate            11     Unicode class name Cs
QChar32::Other_PrivateUse           12     Unicode class name Co
QChar32::Other_NotAssigned          13     Unicode class name Cn

The following categories are informative in Unicode.

Constant                            Value  Description
QChar32::Letter_Uppercase           14     Unicode class name Lu
QChar32::Letter_Lowercase           15     Unicode class name Ll
QChar32::Letter_Titlecase           16     Unicode class name Lt
QChar32::Letter_Modifier            17     Unicode class name Lm
QChar32::Letter_Other               18     Unicode class name Lo
QChar32::Punctuation_Connector      19     Unicode class name Pc
QChar32::Punctuation_Dash           20     Unicode class name Pd
QChar32::Punctuation_Open           21     Unicode class name Ps
QChar32::Punctuation_Close          22     Unicode class name Pe
QChar32::Punctuation_InitialQuote   23     Unicode class name Pi
QChar32::Punctuation_FinalQuote     24     Unicode class name Pf
QChar32::Punctuation_Other          25     Unicode class name Po
QChar32::Symbol_Math                26     Unicode class name Sm
QChar32::Symbol_Currency            27     Unicode class name Sc
QChar32::Symbol_Modifier            28     Unicode class name Sk
QChar32::Symbol_Other               29     Unicode class name So
See also
category()

enum QChar32::Decomposition

This enum type defines the Unicode decomposition attributes.

Constant                    Value
QChar32::NoDecomposition    0
QChar32::Canonical          1
QChar32::Circle             8
QChar32::Compat             16
QChar32::Final              6
QChar32::Font               2
QChar32::Fraction           17
QChar32::Initial            4
QChar32::Isolated           7
QChar32::Medial             5
QChar32::Narrow             13
QChar32::NoBreak            3
QChar32::Small              14
QChar32::Square             15
QChar32::Sub                10
QChar32::Super              9
QChar32::Vertical           11
QChar32::Wide               12
See also
decomposition()

enum QChar32::Direction

This enum type defines the Unicode direction attributes. In order to conform to C/C++ naming conventions "Dir" is prepended to the codes used in the Unicode Standard.

Constant Value
QChar32::DirAL 13
QChar32::DirAN 5
QChar32::DirB 7
QChar32::DirBN 18
QChar32::DirCS 6
QChar32::DirEN 2
QChar32::DirES 3
QChar32::DirET 4
QChar32::DirFSI 21
QChar32::DirL 0
QChar32::DirLRE 11
QChar32::DirLRI 19
QChar32::DirLRO 12
QChar32::DirNSM 17
QChar32::DirON 10
QChar32::DirPDF 16
QChar32::DirPDI 22
QChar32::DirR 1
QChar32::DirRLE 14
QChar32::DirRLI 20
QChar32::DirRLO 15
QChar32::DirS 8
QChar32::DirWS 9
See also
direction()

enum QChar32::JoiningType

This enum type defines the Unicode joining type attributes. In order to conform to C/C++ naming conventions "Joining_" is prepended to the codes used in the Unicode Standard.

Constant Value
QChar32::Joining_None 0
QChar32::Joining_Causing 1
QChar32::Joining_Dual 2
QChar32::Joining_Left 3
QChar32::Joining_Right 4
QChar32::Joining_Transparent 5
See also
joiningType()

enum QChar32::Script

This enum type defines the Unicode script property values. In order to conform to C/C++ naming conventions "Script_" is prepended to the codes used in the Unicode Standard.

Refer to the Unicode Standard Annex #24 for a description of the Unicode script properties.

Constant Value Description
QChar32::Script_Unknown 0 For unassigned, private-use, noncharacter, and surrogate code points.
QChar32::Script_Inherited 1 For characters that may be used with multiple scripts and that inherit their script from the preceding characters. These include nonspacing marks, enclosing marks, and zero width joiner/non-joiner characters.
QChar32::Script_Common 2 For characters that may be used with multiple scripts and that do not inherit their script from the preceding characters.
QChar32::Script_Latin 3
QChar32::Script_Greek 4
QChar32::Script_Cyrillic 5
QChar32::Script_Armenian 6
QChar32::Script_Hebrew 7
QChar32::Script_Arabic 8
QChar32::Script_Syriac 9
QChar32::Script_Thaana 10
QChar32::Script_Devanagari 11
QChar32::Script_Bengali 12
QChar32::Script_Gurmukhi 13
QChar32::Script_Gujarati 14
QChar32::Script_Oriya 15
QChar32::Script_Tamil 16
QChar32::Script_Telugu 17
QChar32::Script_Kannada 18
QChar32::Script_Malayalam 19
QChar32::Script_Sinhala 20
QChar32::Script_Thai 21
QChar32::Script_Lao 22
QChar32::Script_Tibetan 23
QChar32::Script_Myanmar 24
QChar32::Script_Georgian 25
QChar32::Script_Hangul 26
QChar32::Script_Ethiopic 27
QChar32::Script_Cherokee 28
QChar32::Script_CanadianAboriginal 29
QChar32::Script_Ogham 30
QChar32::Script_Runic 31
QChar32::Script_Khmer 32
QChar32::Script_Mongolian 33
QChar32::Script_Hiragana 34
QChar32::Script_Katakana 35
QChar32::Script_Bopomofo 36
QChar32::Script_Han 37
QChar32::Script_Yi 38
QChar32::Script_OldItalic 39
QChar32::Script_Gothic 40
QChar32::Script_Deseret 41
QChar32::Script_Tagalog 42
QChar32::Script_Hanunoo 43
QChar32::Script_Buhid 44
QChar32::Script_Tagbanwa 45
QChar32::Script_Coptic 46
QChar32::Script_Limbu 47
QChar32::Script_TaiLe 48
QChar32::Script_LinearB 49
QChar32::Script_Ugaritic 50
QChar32::Script_Shavian 51
QChar32::Script_Osmanya 52
QChar32::Script_Cypriot 53
QChar32::Script_Braille 54
QChar32::Script_Buginese 55
QChar32::Script_NewTaiLue 56
QChar32::Script_Glagolitic 57
QChar32::Script_Tifinagh 58
QChar32::Script_SylotiNagri 59
QChar32::Script_OldPersian 60
QChar32::Script_Kharoshthi 61
QChar32::Script_Balinese 62
QChar32::Script_Cuneiform 63
QChar32::Script_Phoenician 64
QChar32::Script_PhagsPa 65
QChar32::Script_Nko 66
QChar32::Script_Sundanese 67
QChar32::Script_Lepcha 68
QChar32::Script_OlChiki 69
QChar32::Script_Vai 70
QChar32::Script_Saurashtra 71
QChar32::Script_KayahLi 72
QChar32::Script_Rejang 73
QChar32::Script_Lycian 74
QChar32::Script_Carian 75
QChar32::Script_Lydian 76
QChar32::Script_Cham 77
QChar32::Script_TaiTham 78
QChar32::Script_TaiViet 79
QChar32::Script_Avestan 80
QChar32::Script_EgyptianHieroglyphs 81
QChar32::Script_Samaritan 82
QChar32::Script_Lisu 83
QChar32::Script_Bamum 84
QChar32::Script_Javanese 85
QChar32::Script_MeeteiMayek 86
QChar32::Script_ImperialAramaic 87
QChar32::Script_OldSouthArabian 88
QChar32::Script_InscriptionalParthian 89
QChar32::Script_InscriptionalPahlavi 90
QChar32::Script_OldTurkic 91
QChar32::Script_Kaithi 92
QChar32::Script_Batak 93
QChar32::Script_Brahmi 94
QChar32::Script_Mandaic 95
QChar32::Script_Chakma 96
QChar32::Script_MeroiticCursive 97
QChar32::Script_MeroiticHieroglyphs 98
QChar32::Script_Miao 99
QChar32::Script_Sharada 100
QChar32::Script_SoraSompeng 101
QChar32::Script_Takri 102
QChar32::Script_CaucasianAlbanian 103
QChar32::Script_BassaVah 104
QChar32::Script_Duployan 105
QChar32::Script_Elbasan 106
QChar32::Script_Grantha 107
QChar32::Script_PahawhHmong 108
QChar32::Script_Khojki 109
QChar32::Script_LinearA 110
QChar32::Script_Mahajani 111
QChar32::Script_Manichaean 112
QChar32::Script_MendeKikakui 113
QChar32::Script_Modi 114
QChar32::Script_Mro 115
QChar32::Script_OldNorthArabian 116
QChar32::Script_Nabataean 117
QChar32::Script_Palmyrene 118
QChar32::Script_PauCinHau 119
QChar32::Script_OldPermic 120
QChar32::Script_PsalterPahlavi 121
QChar32::Script_Siddham 122
QChar32::Script_Khudawadi 123
QChar32::Script_Tirhuta 124
QChar32::Script_WarangCiti 125
QChar32::Script_Ahom 126
QChar32::Script_AnatolianHieroglyphs 127
QChar32::Script_Hatran 128
QChar32::Script_Multani 129
QChar32::Script_OldHungarian 130
QChar32::Script_SignWriting 131

enum QChar32::SpecialCharacter

This enum is provided for use with the QChar32 constructor which supports special characters.

Constant                              Value     Description
QChar32::Null                         0x0000    QChar32 with this value isNull()
QChar32::Tabulation                   0x0009    Character tabulation.
QChar32::LineFeed                     0x000a
QChar32::CarriageReturn               0x000d
QChar32::Space                        0x0020
QChar32::Nbsp                         0x00a0    Non-breaking space
QChar32::SoftHyphen                   0x00ad
QChar32::ReplacementCharacter         0xfffd    The character shown when a font has no glyph for a certain code point. A special question mark character is often used. Codecs use this code point when input data can not be represented in Unicode.
QChar32::ObjectReplacementCharacter   0xfffc    Used to represent an object such as an image when such objects can not be presented.
QChar32::ByteOrderMark                0xfeff
QChar32::ByteOrderSwapped             0xfffe
QChar32::ParagraphSeparator           0x2029
QChar32::LineSeparator                0x2028
QChar32::LastValidCodePoint           0x10ffff

enum QChar32::UnicodeVersion

Specifies which version of the Unicode standard introduced a certain character.

Constant                       Value  Description
QChar32::Unicode_1_1           1      Version 1.1
QChar32::Unicode_2_0           2      Version 2.0
QChar32::Unicode_2_1_2         3      Version 2.1.2
QChar32::Unicode_3_0           4      Version 3.0
QChar32::Unicode_3_1           5      Version 3.1
QChar32::Unicode_3_2           6      Version 3.2
QChar32::Unicode_4_0           7      Version 4.0
QChar32::Unicode_4_1           8      Version 4.1
QChar32::Unicode_5_0           9      Version 5.0
QChar32::Unicode_5_1           10     Version 5.1
QChar32::Unicode_5_2           11     Version 5.2
QChar32::Unicode_6_0           12     Version 6.0
QChar32::Unicode_6_1           13     Version 6.1
QChar32::Unicode_6_2           14     Version 6.2
QChar32::Unicode_6_3           15     Version 6.3
QChar32::Unicode_7_0           16     Version 7.0
QChar32::Unicode_8_0           17     Version 8.0
QChar32::Unicode_Unassigned    0      Value is not assigned to any character in version 8.0 of Unicode.
See also
unicodeVersion()

Constructor & Destructor Documentation

QChar32::QChar32 ( )
default

Constructs a null QChar32.

See also
isNull()
QChar32::QChar32 ( char  c)
inline explicit

Constructs a QChar32 corresponding to the Latin-1 character c.

QChar32::QChar32 ( char32_t  c)
inline

Constructs a QChar32 for the character with Unicode code point c.

QChar32::QChar32 ( char16_t  c)
inline

Constructs a QChar32 for the character with Unicode code point c.

QChar32::QChar32 ( int  c)
inline

Constructs a QChar32 for the character with Unicode code point c.

QChar32::QChar32 ( SpecialCharacter  c)
inline

Constructs a QChar32 for the predefined character value c.

QChar32::~QChar32 ( )
default

Destroys a QChar32.

Method Documentation

Category QChar32::category ( ) const

Returns the category specified by the Unicode standard.

unsigned char QChar32::combiningClass ( ) const

Returns the combining class for the character as defined in the Unicode standard. This is mainly useful as a positioning hint for marks attached to a base character.

The text rendering engine uses this information to correctly position non-spacing marks around a base character.

UnicodeVersion QChar32::currentUnicodeVersion ( )
static

Returns the most recent supported Unicode version.

QString8 QChar32::decomposition ( ) const

Decomposes a character into a base character followed by one or more combining characters. Returns an empty string if no decomposition exists.

Decomposition QChar32::decompositionTag ( ) const

Returns the tag defining the composition of the Unicode character. Returns QChar32::NoDecomposition if no decomposition exists.

int QChar32::digitValue ( ) const

Returns the numeric value of the digit as specified by the Unicode standard. Returns -1 if the character is not a digit.

Direction QChar32::direction ( ) const

Returns the direction specified by the Unicode standard.

QChar32 QChar32::fromLatin1 ( char  c)
inline static

Converts the Latin-1 character c to its equivalent QChar32. This is mainly useful for non-internationalized software.

See also
unicode()
bool QChar32::hasMirrored ( ) const

Returns true if the character should be reversed if the text direction is reversed, otherwise returns false. Equivalent to calling the following code.

ch.mirroredChar() != ch;
See also
mirroredChar()
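
The relationship between hasMirrored() and mirroredChar() can be illustrated without the library. The sketch below is a standard C++ stand-in, not the CopperSpice implementation. The three-pair table and the names mirroredCharSketch() and hasMirroredSketch() are assumptions made for this example, and a real implementation would consult the full Unicode mirroring data.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, deliberately tiny mirroring table: ( ), [ ] and < >.
struct MirrorPair {
    std::uint32_t from;
    std::uint32_t to;
};

static const MirrorPair kMirrorPairs[] = {
    {0x0028, 0x0029}, {0x0029, 0x0028},   // ( and )
    {0x005B, 0x005D}, {0x005D, 0x005B},   // [ and ]
    {0x003C, 0x003E}, {0x003E, 0x003C},   // < and >
};

// Returns the mirrored code point, or cp itself when the table has no entry.
inline std::uint32_t mirroredCharSketch(std::uint32_t cp)
{
    for (const MirrorPair &pair : kMirrorPairs) {
        if (pair.from == cp) {
            return pair.to;
        }
    }
    return cp;
}

// Same shape as the equivalence stated above: mirrored if the mirror differs.
inline bool hasMirroredSketch(std::uint32_t cp)
{
    return mirroredCharSketch(cp) != cp;
}

// Small usage check for the sketch.
inline void mirrorSketchChecks()
{
    assert(hasMirroredSketch(0x0028));    // '(' mirrors to ')'
    assert(!hasMirroredSketch(0x0041));   // 'A' is returned unchanged
}
```
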
bool QChar32::isDigit ( ) const
inline

Returns true if the character is a decimal digit (Number_DecimalDigit), otherwise returns false.

bool QChar32::isLetter ( ) const

Returns true if the character is a letter (Letter_* categories), otherwise returns false.

bool QChar32::isLetterOrNumber ( ) const

Returns true if the character is a letter or number (Letter_* or Number_* categories), otherwise returns false.
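
The category based predicates can be read as range checks over the QChar32::Category values listed earlier on this page. The following is a minimal sketch in standard C++, assuming a stand-in enum named CategorySketch whose values mirror those tables. The predicate names are hypothetical, and the library itself may implement these checks differently.

```cpp
#include <cassert>

// Stand-in for QChar32::Category. Enumerators are listed in table order,
// so their implicit values run from 0 (Mark_NonSpacing) to 29 (Symbol_Other).
enum CategorySketch {
    Mark_NonSpacing, Mark_SpacingCombining, Mark_Enclosing,
    Number_DecimalDigit, Number_Letter, Number_Other,
    Separator_Space, Separator_Line, Separator_Paragraph,
    Other_Control, Other_Format, Other_Surrogate,
    Other_PrivateUse, Other_NotAssigned,
    Letter_Uppercase, Letter_Lowercase, Letter_Titlecase,
    Letter_Modifier, Letter_Other,
    Punctuation_Connector, Punctuation_Dash, Punctuation_Open,
    Punctuation_Close, Punctuation_InitialQuote,
    Punctuation_FinalQuote, Punctuation_Other,
    Symbol_Math, Symbol_Currency, Symbol_Modifier, Symbol_Other
};

static_assert(Letter_Uppercase == 14, "values must match the Category table");
static_assert(Symbol_Other == 29, "values must match the Category table");

// Decimal digits only (Number_DecimalDigit).
inline bool isDigitSketch(CategorySketch c)
{
    return c == Number_DecimalDigit;
}

// Any Number_* category, which is broader than decimal digits.
inline bool isNumberSketch(CategorySketch c)
{
    return c >= Number_DecimalDigit && c <= Number_Other;
}

// Any Letter_* category.
inline bool isLetterSketch(CategorySketch c)
{
    return c >= Letter_Uppercase && c <= Letter_Other;
}

inline bool isLetterOrNumberSketch(CategorySketch c)
{
    return isLetterSketch(c) || isNumberSketch(c);
}

// Small usage check for the sketch.
inline void categorySketchChecks()
{
    assert(isNumberSketch(Number_Letter) && !isDigitSketch(Number_Letter));
    assert(isLetterOrNumberSketch(Letter_Titlecase));
}
```
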

bool QChar32::isLower ( ) const
inline

Returns true if the character is a lowercase letter, i.e. category() is Letter_Lowercase.

See also
isUpper(), toLower(), toUpper()
bool QChar32::isMark ( ) const

Returns true if the character is a mark (Mark_* categories), otherwise returns false.

Refer to QChar32::Category for more information regarding marks.

bool QChar32::isNonCharacter ( ) const

Returns true if the QChar32 is a non-character, otherwise returns false.

Unicode has a certain number of code points that are classified as "non-characters", which are used for internal purposes in applications but cannot be used for text interchange. These are the last two entries in each Unicode plane ([0xfffe..0xffff], [0x1fffe..0x1ffff], etc.) as well as the entries in the range [0xfdd0..0xfdef].
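
The rule described above can be written as a short range test. This is a sketch in standard C++ which operates on a raw uint32_t value rather than a QChar32. The name isNonCharacterSketch() is hypothetical and the code is not taken from the CopperSpice sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper implementing the non-character rule on a raw code point.
inline bool isNonCharacterSketch(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);                 // largest valid Unicode code point

    // The contiguous block of 32 non-characters, U+FDD0..U+FDEF.
    if (cp >= 0xFDD0 && cp <= 0xFDEF) {
        return true;
    }

    // Last two entries of every plane: the low 16 bits are 0xFFFE or 0xFFFF.
    return (cp & 0xFFFE) == 0xFFFE;
}
```

For example, 0xFFFE, 0x1FFFF, and 0x10FFFF are all reported as non-characters, while 0xFFFD (the replacement character) and 0xFDF0 are not.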

bool QChar32::isNull ( ) const
inline

Returns true if the character is the Unicode character 0x0000 ('\0'), otherwise returns false.

bool QChar32::isNumber ( ) const

Returns true if the character is a number (Number_* categories, not just 0-9), otherwise returns false.

See also
isDigit()
bool QChar32::isPrint ( ) const

Returns true if the character is a printable character, otherwise returns false. This is any character not of category Cc or Cn. This method gives no indication of whether the character is available in a particular font.

bool QChar32::isPunct ( ) const

Returns true if the character is a punctuation mark (Punctuation_* categories), otherwise returns false.

bool QChar32::isSpace ( ) const

Returns true if the character is a separator character, otherwise returns false.

bool QChar32::isSymbol ( ) const

Returns true if the character is a symbol (Symbol_* categories), otherwise returns false.

bool QChar32::isTitleCase ( ) const
inline

Returns true if the character is a titlecase letter, i.e. category() is Letter_Titlecase.

See also
isLower(), toUpper(), toLower(), toTitleCase()
bool QChar32::isUpper ( ) const
inline

Returns true if the character is an uppercase letter, i.e. category() is Letter_Uppercase.

See also
isLower(), toUpper(), toLower()
JoiningType QChar32::joiningType ( ) const

Returns information about the joining type attributes of the character. This information is needed for certain languages such as Arabic or Syriac.

QChar32 QChar32::mirroredChar ( ) const

Returns the mirrored character if this character is a mirrored character in the Unicode standard, otherwise returns the character itself.

See also
hasMirrored()
QChar32 & QChar32::operator= ( QChar32  c) &
inline

Copy assigns from c and returns a reference to this object.

Script QChar32::script ( ) const

Returns the Unicode script property value for this Unicode character.

QString8 QChar32::toCaseFolded ( ) const

Returns the case folded equivalent of this Unicode character. For most Unicode characters this is the same as toLower().

QString16 QChar32::toCaseFolded16 ( ) const

Returns the case folded equivalent of the Unicode character. For most Unicode characters this is the same as toLower().

char QChar32::toLatin1 ( ) const
inline

Returns the Latin-1 character equivalent to the QChar32 or a null character. This is mainly useful for non-internationalized software.

See also
unicode()
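
Latin-1 covers exactly the code points U+0000 through U+00FF, which is why the conversion can fail. The sketch below assumes that "or a null character" means any code point above 0xFF maps to '\0'. That reading is an assumption, and the helper names toLatin1Sketch() and fromLatin1Sketch() are hypothetical stand-ins written in standard C++.

```cpp
#include <cassert>
#include <cstdint>

// Assumed behavior: code points outside Latin-1 collapse to a null character.
inline char toLatin1Sketch(std::uint32_t cp)
{
    return cp <= 0xFF ? static_cast<char>(cp) : '\0';
}

// A Latin-1 byte maps to the code point with the same numeric value.
// The cast through unsigned char avoids sign extension when char is signed.
inline std::uint32_t fromLatin1Sketch(char c)
{
    return static_cast<unsigned char>(c);
}

// Small usage check for the sketch.
inline void latin1SketchChecks()
{
    assert(toLatin1Sketch(0x0041) == 'A');
    assert(toLatin1Sketch(0x21B4) == '\0');   // not representable in Latin-1
}
```

A value such as U+00E9 survives a round trip through both helpers, while U+21B4 does not.
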
QString8 QChar32::toLower ( ) const

Returns the lowercase equivalent if the character is uppercase or titlecase, otherwise returns the character itself.

QString8 QChar32::toTitleCase ( ) const

Returns the title case equivalent if the character is lowercase or uppercase, otherwise returns the character itself.

QString8 QChar32::toUpper ( ) const

Returns the uppercase equivalent if the character is lowercase or titlecase, otherwise returns the character itself.

uint32_t QChar32::unicode ( ) const
inline

Returns the numeric Unicode value stored in the current QChar32.

UnicodeVersion QChar32::unicodeVersion ( ) const

Returns the Unicode version which introduced this character.

Friends And Related Function Documentation

bool operator!= ( QChar32  c1,
QChar32  c2 
)
related

Returns true if c1 and c2 are not the same Unicode character, otherwise returns false.

bool operator< ( QChar32  c1,
QChar32  c2 
)
related

Returns true if the numeric Unicode value of c1 is less than that of c2, otherwise returns false.

QDataStream & operator<< ( QDataStream &  stream,
QChar32  ch 
)
related

Writes the given ch to the stream. Returns a reference to the stream.

Refer to Serializing Data Types for additional information.

bool operator<= ( QChar32  c1,
QChar32  c2 
)
related

Returns true if the numeric Unicode value of c1 is less than or equal to that of c2, otherwise returns false.

bool operator== ( QChar32  c1,
QChar32  c2 
)
related

Returns true if c1 and c2 are the same Unicode character, otherwise returns false.

bool operator> ( QChar32  c1,
QChar32  c2 
)
related

Returns true if the numeric Unicode value of c1 is greater than that of c2, otherwise returns false.

bool operator>= ( QChar32  c1,
QChar32  c2 
)
related

Returns true if the numeric Unicode value of c1 is greater than or equal to that of c2, otherwise returns false.
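
Since the relational operators compare numeric Unicode values, the resulting order is code point order and not a dictionary or locale order. The sketch below shows the effect using plain char32_t values as a stand-in for QChar32. The function name sortedByCodePoint() is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sorts code points by numeric value, the same ordering the operators describe.
inline std::vector<char32_t> sortedByCodePoint(std::vector<char32_t> values)
{
    std::sort(values.begin(), values.end());
    return values;
}

// Small usage check for the sketch.
inline void orderingSketchChecks()
{
    // 'Z' is U+005A and 'a' is U+0061, so every uppercase ASCII letter
    // sorts before every lowercase ASCII letter.
    assert(U'Z' < U'a');
}
```

Sorting the values 'a', 'Z', U+00E9, 'A' with this helper gives 'A', 'Z', 'a', U+00E9.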

QDataStream & operator>> ( QDataStream &  stream,
QChar32 &  ch 
)
related

Reads from the stream into the given ch. Returns a reference to the stream.

Refer to Serializing Data Types for additional information.