c++ boost encoding locale iostream

How to detect and handle unsupported locales in algorithms?


I have a function with the following signature:

 template <typename Container>
 void write_cards_as_xml(const Container& cards, std::ostream& os);

Internally it calls:

boost::property_tree::ptree root;
...
boost::property_tree::write_xml(os, root);

The write_xml function does not know anything about encodings. By default it assumes UTF-8 but does not do any conversions; that is left to the locale of os. I'm not sure how to handle unsupported non-UTF-8 locales. Can I detect whether the locale is not UTF-8? Should I throw if it isn't? Should I temporarily replace the locale with my preferred encoding? I'm using Boost.Locale.


Solution

  • The standard library has no platform-independent way to detect whether a locale is UTF-8. There is only the name() method, which returns a platform-dependent name. Even if it is a POSIX name, there is no guarantee that the encoding is part of the locale's name.

    Boost.Locale offers an additional facet called boost::locale::info holding detailed information about the current locale. https://www.boost.org/doc/libs/1_70_0/libs/locale/doc/html/locale_information.html

    You can obtain the info like this:

    std::use_facet<boost::locale::info>(some_locale).utf8()
    

    If there is no info facet, std::use_facet throws std::bad_cast. In that case it is not a Boost locale and you are out of luck. Throwing is reasonable behavior here; you could catch the bad_cast and throw a more informative exception instead. If there is an info facet, inspect the return value of utf8(). If it returns false, the locale is not compatible and you should throw, too. Otherwise your algorithm can run without problems.
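
The check described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper name require_utf8 and the stand-in facet demo_info are hypothetical and not part of Boost or the standard library, and the facet type is a template parameter so the sketch compiles without Boost. It also uses std::has_facet to test for the facet up front, which is a variation on catching std::bad_cast from std::use_facet.

```cpp
#include <cstddef>
#include <locale>
#include <stdexcept>

// Hypothetical helper: throws std::runtime_error unless `loc` carries an
// InfoFacet whose utf8() member returns true. With Boost.Locale, InfoFacet
// would be boost::locale::info.
template <typename InfoFacet>
void require_utf8(const std::locale& loc) {
    // std::has_facet never throws, so the missing-facet case can be
    // reported with a clearer message than a bare std::bad_cast.
    if (!std::has_facet<InfoFacet>(loc)) {
        throw std::runtime_error(
            "locale has no info facet; its encoding cannot be determined");
    }
    if (!std::use_facet<InfoFacet>(loc).utf8()) {
        throw std::runtime_error("locale encoding is not UTF-8");
    }
}

// Hypothetical stand-in for boost::locale::info, for demonstration only.
// It exposes just the utf8() query used by require_utf8.
class demo_info : public std::locale::facet {
public:
    static std::locale::id id;  // required for every facet type

    explicit demo_info(bool is_utf8, std::size_t refs = 0)
        : std::locale::facet(refs), utf8_(is_utf8) {}

    bool utf8() const { return utf8_; }

private:
    bool utf8_;
};

std::locale::id demo_info::id;
```

Inside write_cards_as_xml, assuming Boost.Locale is available, the call would presumably be require_utf8<boost::locale::info>(os.getloc()); placed before boost::property_tree::write_xml(os, root);, so an incompatible stream is rejected before anything is written.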