Tags: c++, utf-8, utf8-decode

UTF8 char to hex value string


I need a way to convert chars into hex values as strings.

I've tried a few approaches, but all of them mishandled non-ASCII (UTF-8) characters.

For example, take the character:

Ş

If it is converted correctly, its hex value is 0x15E, but this code returns 0x3F, which is just the character ?.

wchar_t mychar = 'Ş';
cout << hex << setw(2) << setfill('0') 
                  << static_cast<unsigned int>(mychar);

I've found a JavaScript function that does exactly what I need, but I couldn't convert it to C++.

Thanks


Solution

  • The problem is that you are assigning a char literal to wchar_t mychar. Because a char is only one byte long, the literal cannot hold the character Ş (U+015E), which is why you get ? (0x3F) instead. Prefix the literal with L to make it a wide-character literal, like this:

    wchar_t mychar = L'Ş';
    

    A very good article about Unicode, encodings, etc. is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.
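
If the character arrives as UTF-8 bytes in a std::string rather than as a wide literal, the L prefix does not help, and the bytes have to be decoded into a code point first. The following is only a minimal sketch of that idea, not a complete decoder. The names decode_first_utf8 and code_point_to_hex are hypothetical helpers made up for this example, and the validation is deliberately light (overlong encodings and surrogates are not rejected).

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode the first code point of a UTF-8 byte string.
// Assumption: light validation only. Truncated input and bad continuation
// bytes are rejected; overlong encodings and surrogates are not.
char32_t decode_first_utf8(const std::string& s) {
    if (s.empty()) throw std::invalid_argument("empty input");

    const unsigned char lead = static_cast<unsigned char>(s[0]);
    std::size_t len = 0;
    char32_t cp = 0;

    // The lead byte tells us how many bytes the sequence has.
    if (lead < 0x80)                { len = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
    else throw std::invalid_argument("invalid lead byte");

    if (s.size() < len) throw std::invalid_argument("truncated sequence");

    // Each continuation byte contributes its low 6 bits.
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c & 0xC0) != 0x80) {
            throw std::invalid_argument("bad continuation byte");
        }
        cp = (cp << 6) | (c & 0x3F);
    }
    return cp;
}

// Hypothetical helper: format a code point as a hex string such as "0x15E".
std::string code_point_to_hex(char32_t cp) {
    std::ostringstream out;
    out << "0x" << std::hex << std::uppercase
        << static_cast<std::uint32_t>(cp);
    return out.str();
}
```

Ş is encoded in UTF-8 as the two bytes 0xC5 0x9E, so code_point_to_hex(decode_first_utf8("\xC5\x9E")) gives the string "0x15E". Going through a string stream instead of writing to cout directly is what produces the hex value as a string, which is what the question asks for.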