How to remove accents / diacritic marks from a string in Qt?


How can I remove diacritic marks from a string in Qt? For example, this:

QString test = QString::fromUtf8("éçàÖœ");
qDebug() << StringUtil::removeAccents(test);

should output:

ecaOoe

Solution

  • There is no straightforward, built-in solution in Qt. A simple approach, which should work in most cases, is to loop through the string and replace each character with its unaccented equivalent:

    QString StringUtil::diacriticLetters_;
    QStringList StringUtil::noDiacriticLetters_;
    
    QString StringUtil::removeAccents(QString s) {
        if (diacriticLetters_.isEmpty()) {
            diacriticLetters_ = QString::fromUtf8("ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ");
            noDiacriticLetters_ << "S"<<"OE"<<"Z"<<"s"<<"oe"<<"z"<<"Y"<<"Y"<<"u"<<"A"<<"A"<<"A"<<"A"<<"A"<<"A"<<"AE"<<"C"<<"E"<<"E"<<"E"<<"E"<<"I"<<"I"<<"I"<<"I"<<"D"<<"N"<<"O"<<"O"<<"O"<<"O"<<"O"<<"O"<<"U"<<"U"<<"U"<<"U"<<"Y"<<"s"<<"a"<<"a"<<"a"<<"a"<<"a"<<"a"<<"ae"<<"c"<<"e"<<"e"<<"e"<<"e"<<"i"<<"i"<<"i"<<"i"<<"o"<<"n"<<"o"<<"o"<<"o"<<"o"<<"o"<<"o"<<"u"<<"u"<<"u"<<"u"<<"y"<<"y";
        }
    
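        QString output;
        for (int i = 0; i < s.length(); i++) {
            QChar c = s[i];
            // Look the character up in the table of accented letters.
            int dIndex = diacriticLetters_.indexOf(c);
            if (dIndex < 0) {
                // Not in the table: keep the character unchanged.
                output.append(c);
            } else {
                // The replacement sits at the same index and may be longer than one character.
                QString replacement = noDiacriticLetters_[dIndex];
                output.append(replacement);
            }
        }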
    
        return output;
    }
    

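    Note that noDiacriticLetters_ needs to be a QStringList rather than a QString, because some characters map to more than one replacement character. For example, œ => oe. The two containers must also stay aligned: the character at index i of diacriticLetters_ is replaced by entry i of noDiacriticLetters_.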
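
The same table lookup can be sketched without Qt, which may help when testing the mapping in isolation. The version below is only an illustration in standard C++: the free function removeAccents and the kReplacements table are hypothetical names, the input is assumed to be already decoded into UTF-32 code points, and only a handful of characters are listed. Keying a map on the accented character, instead of keeping two parallel containers, means the two sides cannot drift out of alignment.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of Qt): replaces each code point found in the
// table with its ASCII equivalent and copies everything else unchanged.
std::u32string removeAccents(const std::u32string &input) {
    // Illustrative subset only; values are strings because one character
    // may expand to several (for example U+0153 becomes "oe").
    static const std::unordered_map<char32_t, std::u32string> kReplacements = {
        {U'\u00E9', U"e"},   // e with acute
        {U'\u00E7', U"c"},   // c with cedilla
        {U'\u00E0', U"a"},   // a with grave
        {U'\u00D6', U"O"},   // O with diaeresis
        {U'\u0153', U"oe"},  // oe ligature
        {U'\u00C6', U"AE"},  // AE ligature
    };

    std::u32string output;
    output.reserve(input.size());
    for (char32_t c : input) {
        auto it = kReplacements.find(c);
        if (it == kReplacements.end()) {
            output.push_back(c);        // not in the table: keep as-is
        } else {
            output.append(it->second);  // in the table: append the replacement
        }
    }
    return output;
}
```

With this table, removeAccents(U"\u00E9\u00E7\u00E0\u00D6\u0153") (the string éçàÖœ from the question) returns U"ecaOoe".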