qt, unicode, qstring

Length of a UTF-32 character in Qt


I'm using Qt5. I have a QString holding one character, U"\x1D4CC" (𝓌), whose code point does not fit in 16 bits. Even though this is only one character, Qt reports the size of the string as 2. Is there any way to find out how many real characters a QString holds, assuming it may contain characters that need 32 bits?


Solution

  • Unicode characters with code points above 65535 (U+FFFF) are stored as surrogate pairs, i.e., two consecutive QChars. QString::length() returns the number of QChars in the string, which may differ from the number of graphemes (user-perceived characters).

    To count graphemes, you can use the QTextBoundaryFinder class.

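    #include <QDebug>
    #include <QString>
    #include <QTextBoundaryFinder>

    QString str = "𝓌";  // one code point (U+1D4CC), two QChars; assumes a UTF-8 source file
    QTextBoundaryFinder finder(QTextBoundaryFinder::Grapheme, str);
    int count = 0;
    // Each successful step to the next grapheme boundary ends one grapheme.
    while (finder.toNextBoundary() != -1)
        ++count;
    qDebug() << count;  // prints 1 for this string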
    

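    Alternatively, convert the string to its UCS-4/UTF-32 representation and count the 32-bit code points. Note that this counts code points rather than graphemes, so a base letter followed by a combining mark counts as two.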

    QVector<uint> ucs4 = str.toUcs4();  // one 32-bit element per code point
    qDebug() << ucs4.size();            // prints 1 for "𝓌"
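
To see why the length comes out as 2, it may help to look at the surrogate-pair mechanism on its own. The following is a minimal, Qt-free sketch in standard C++; the helper names `isLowSurrogate`, `encodeUtf16`, and `countCodePoints` are made up for illustration and are not part of Qt. It assumes well-formed UTF-16 and valid code points, and it counts code points, not graphemes.

```cpp
#include <cstddef>
#include <string>

// True if the UTF-16 code unit is a low (trailing) surrogate, 0xDC00..0xDFFF.
constexpr bool isLowSurrogate(char16_t unit) {
    return (unit & 0xFC00) == 0xDC00;
}

// Encodes one code point as UTF-16: a single unit up to U+FFFF,
// a surrogate pair above that.
// Assumption: cp is a valid Unicode scalar value (no validation is done).
std::u16string encodeUtf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        const char32_t v = cp - 0x10000;  // 20 bits remain
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate: top 10 bits
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate: bottom 10 bits
    }
    return out;
}

// Counts code points by counting every code unit that is not a low surrogate.
// Assumption: the input is well-formed UTF-16; an unpaired low surrogate
// would simply not be counted.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t unit : s) {
        if (!isLowSurrogate(unit))
            ++n;
    }
    return n;
}
```

With these helpers, `encodeUtf16(0x1D4CC)` yields the two units 0xD835 and 0xDCCC, which is what QString stores for 𝓌, and `countCodePoints` reports 1 for it. In Qt itself the same idea could be written by looping over the string and skipping QChars for which `QChar::isLowSurrogate()` is true, although `toUcs4().size()` as shown above is shorter.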