c++ unicode utf-8 x11 xcb

UTF8 char array to std::wstring


I am trying to get an X11 window title and store it in a std::wstring. I use the following calls to get the title:

auto req_title = xcb_get_property(conn, 0, window, XCB_ATOM_WM_NAME, XCB_GET_PROPERTY_TYPE_ANY, 0, 100);
auto res_title = xcb_get_property_reply(conn, req_title, nullptr);

After that, I can get the title as a char array. How can I convert this array to a std::wstring?


Solution

  • Current solution

    You can use std::wstring_convert to convert between a byte string and a wide string, with a codecvt facet specifying the conversion to be performed.

    Example of use:

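    // Needs <codecvt>, <locale> and <string>.
    std::string so = u8"Jérôme Ângle";  // UTF-8 bytes (before C++20, where u8"" became char8_t)
    std::wstring st;
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    st = converter.from_bytes(so);      // throws std::range_error on invalid UTF-8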
    

    If you have a C string (a NUL-terminated array of char), the const char* overload of from_bytes() does exactly what you want:

    char p[] = u8"Jérôme Ângle";  // before C++20
    std::wstring ws = converter.from_bytes(p);
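
The property value returned by xcb is a byte buffer with an explicit length, not necessarily NUL-terminated, so the two-pointer overload of from_bytes() is probably the safer fit for the question. A minimal sketch follows, assuming the property really holds UTF-8 (WM_NAME is often Latin-1 or COMPOUND_TEXT; _NET_WM_NAME is the UTF-8 one). The helper name title_to_wstring is hypothetical.

```cpp
#include <codecvt>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: convert exactly `len` bytes of UTF-8 starting at `data`.
// No terminating NUL is required. Throws std::range_error on invalid UTF-8.
std::wstring title_to_wstring(const char* data, std::size_t len) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> converter;
    return converter.from_bytes(data, data + len);
}

// Possible use with the reply from the question (assumes res_title != nullptr):
//   auto* bytes = static_cast<const char*>(xcb_get_property_value(res_title));
//   auto  len   = xcb_get_property_value_length(res_title);
//   std::wstring title = title_to_wstring(bytes, static_cast<std::size_t>(len));
//   free(res_title);
```

The reply must still be released with free() once the title has been copied out.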
    


    Is it sustainable?

    As pointed out in the comments, C++17 deprecated the <codecvt> header and the std::wstring_convert utility, for the following reason:

    These features are hard to use correctly, and there are doubts whether they are even specified correctly. Users should use dedicated text-processing libraries instead.

    In addition, a wstring is based on wchar_t, whose size and encoding differ between platforms: it is typically 32 bits wide (UTF-32) on Linux and 16 bits wide (UTF-16) on Windows.
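
The platform difference can be made concrete with a small sketch. The helper name units_for is hypothetical; it only reports how many wchar_t elements a single code point occupies on the platform the code is compiled for.

```cpp
#include <cstddef>

// Hypothetical helper: number of wchar_t elements needed to store one
// Unicode code point on the current platform.
constexpr std::size_t units_for(char32_t cp) {
    // With a 16-bit wchar_t (UTF-16), code points above U+FFFF need a surrogate pair.
    return (sizeof(wchar_t) == 2 && cp > 0xFFFF) ? 2 : 1;
}

static_assert(units_for(U'\u00E9') == 1, "BMP characters fit in one element");
```

So the same title can have a different wstring length on each platform as soon as it contains a character outside the Basic Multilingual Plane, such as an emoji.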

    So the first question would be to ask why a wstring is needed at all, and why not just keep utf-8 everywhere.

    Depending on the reasons, you may consider using:

    • ICU and its UnicodeString, for full, in-depth Unicode support.
    • Boost.Locale and its to_utf or utf_to_utf, for common Unicode-related tasks.
    • utf8-cpp, for working with UTF-8 strings the Unicode way (caution: it seems to be unmaintained).