c++ boost locale

Using boost::iequals with std::u16string


I'm trying to do a case-insensitive string comparison on two std::u16string instances with Boost. From what I've found, I need to generate a locale, which I'm doing.

#include <boost/algorithm/string.hpp>
#include <boost/locale.hpp>

#include <locale>
#include <iostream>

int main() {
    // Create the strings
    std::u16string str1 = boost::locale::conv::utf_to_utf<char16_t>("unicode");
    std::u16string str2 = boost::locale::conv::utf_to_utf<char16_t>("UNICODE");

    // Create the locale
    boost::locale::generator gen;
    std::locale loc = gen("");

    // Doesn't matter if I do this or not
    //std::locale::global(loc);

    // Try to compare
    if (boost::iequals(str1, str2, loc)) {
        std::cout << "EQUAL\n";
    } else {
        std::cout << "!EQUAL\n";
    }

    return 0;
}

This results in a std::bad_cast exception:

terminate called after throwing an instance of 'std::bad_cast'
  what():  std::bad_cast

What am I doing wrong?


Solution

  • std::u16string uses char16_t (as you know).

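    boost::iequals uses the locale-aware std::toupper(c, loc) internally to compare the characters of the two strings.

    That overload of std::toupper looks up the facet std::ctype<CharT> in the locale, where CharT = char16_t in our case. As explained in this answer, the standard does not require this facet for char16_t, so most implementations do not install it in their locales. When the facet is missing, the lookup throws std::bad_cast, which is the exception you are seeing.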

    To support widening, narrowing, and classification of a character type, std::ctype has to be specialized for that type and installed in the locale being used. There is no ready-made specialization for char16_t or char32_t.

    So you are doing nothing wrong; the support just isn't there. If you really need 16-bit Unicode string support, I'd recommend looking at a third-party library such as Qt, whose QString class uses 16-bit characters by default.
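If pulling in another library is not an option, one possible workaround is to route each code unit through std::ctype<wchar_t>, a facet that the standard does require every locale to provide. The following is only a minimal sketch under stated assumptions, not what the answer above proposes: the helper name iequals_u16 is hypothetical, it handles only simple one-to-one case mappings, it compares surrogate pairs (characters outside the BMP) code unit by code unit without case mapping them as whole characters, and which non-ASCII characters get mapped depends on the locale passed in.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper, not part of Boost or the standard library.
// Compares two UTF-16 strings code unit by code unit, ignoring case,
// by widening each unit to wchar_t and using std::ctype<wchar_t>.
// Assumptions: simple one-to-one case mappings are sufficient, and
// surrogate halves are compared without being combined into characters.
bool iequals_u16(const std::u16string& a, const std::u16string& b,
                 const std::locale& loc = std::locale::classic()) {
    if (a.size() != b.size()) {
        return false;
    }
    // Unlike std::ctype<char16_t>, this facet is present in every locale.
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t>>(loc);
    for (std::size_t i = 0; i < a.size(); ++i) {
        const wchar_t x = ct.toupper(static_cast<wchar_t>(a[i]));
        const wchar_t y = ct.toupper(static_cast<wchar_t>(b[i]));
        if (x != y) {
            return false;
        }
    }
    return true;
}
```

With this, iequals_u16(u"unicode", u"UNICODE") returns true where the boost::iequals call in the question throws. For anything beyond simple mappings (full case folding, normalization), a Unicode-aware library is still the better choice.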