c++ winapi utf-16 wstring valueconverter

How can I convert wstring to u16string?


I want to convert wstring to u16string in C++.

I can convert wstring to string and back, but I don't know how to convert to u16string.

u16string CTextConverter::convertWstring2U16(wstring str)
{
    int iSize;
    u16string szDest[256] = {};
    memset(szDest, 0, 256);
    iSize = WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, NULL, 0,0,0);

    WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, szDest, iSize,0,0);
    u16string s16 = szDest;
    return s16;
}

The compiler reports an error for szDest in the second call, WideCharToMultiByte(CP_UTF8, NULL, str.c_str(), -1, szDest, iSize,0,0), because a u16string can't be passed where an LPSTR is expected.

How can I fix this code?


Solution

  • For a platform-independent solution see this answer.

    If you need a solution only for the Windows platform, the following code will be sufficient:

    std::wstring wstr( L"foo" );
    std::u16string u16str( wstr.begin(), wstr.end() );
    

    On the Windows platform, a std::wstring is interchangeable with std::u16string because sizeof(wstring::value_type) == sizeof(u16string::value_type) and both are UTF-16 (little endian) encoded.

    wstring::value_type = wchar_t
    u16string::value_type = char16_t
    

    The only difference is that wchar_t and char16_t are distinct types (with MSVC, both are unsigned and 16 bits wide), so one string type cannot simply be assigned to the other. Each code unit has to be converted, which the u16string constructor that takes an iterator pair does implicitly; the values are left unchanged.
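
    Applied to the function from the question, the conversion could be wrapped as in the sketch below. The names wstringToU16 and u16ToWstring are hypothetical, and the helpers are free functions because the CTextConverter class itself is not shown. The sketch is only meaningful where wchar_t holds a 16-bit UTF-16 code unit, as on Windows.

```cpp
#include <string>

// Sketch only: assumes wchar_t is a 16-bit UTF-16 code unit (as on Windows).
// Function names are hypothetical, not part of any library.
std::u16string wstringToU16(const std::wstring& str)
{
    // The iterator-pair constructor converts each wchar_t to char16_t.
    return std::u16string(str.begin(), str.end());
}

// The reverse direction works the same way.
std::wstring u16ToWstring(const std::u16string& str)
{
    return std::wstring(str.begin(), str.end());
}
```

    Taking the argument by const reference avoids the copy that the by-value parameter in the question would make.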

    Full example console application:

    #include <windows.h>
    #include <string>
    
    int main()
    {
        static_assert( sizeof(std::wstring::value_type) == sizeof(std::u16string::value_type),
            "std::wstring and std::u16string are expected to have the same character size" );
       
        std::wstring wstr( L"foo" );
        std::u16string u16str( wstr.begin(), wstr.end() );
       
        // The u16string constructor performs an implicit conversion like:
        wchar_t wch = L'A';
        char16_t ch16 = wch;
       
        // Need to reinterpret_cast because char16_t const* is not implicitly convertible
        // to LPCWSTR (aka wchar_t const*).
        ::MessageBoxW( 0, reinterpret_cast<LPCWSTR>( u16str.c_str() ), L"test", 0 );
       
        return 0;
    }
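
    For platforms where wchar_t is 32 bits wide (typically holding UTF-32), a plain element-wise copy would truncate code points above U+FFFF. The following is a minimal sketch of one way to handle both cases, not necessarily the approach taken in the linked answer. The function name toUtf16 is hypothetical, and the input is assumed to be valid UTF-16 (16-bit wchar_t) or valid UTF-32 (32-bit wchar_t); no validation is performed.

```cpp
#include <string>

// Sketch: convert a std::wstring to UTF-16 regardless of the width of wchar_t.
// Assumes valid UTF-16 input for 16-bit wchar_t, valid UTF-32 for 32-bit wchar_t.
std::u16string toUtf16(const std::wstring& in)
{
    std::u16string out;
    out.reserve(in.size());
    for (wchar_t wc : in) {
        const char32_t cp = static_cast<char32_t>(wc);
        if (cp < 0x10000) {
            // A BMP code point or, with 16-bit wchar_t, an existing
            // UTF-16 code unit (surrogates included): copy unchanged.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary code point: encode as a surrogate pair.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

    With 16-bit wchar_t the else branch is never taken, so on Windows this behaves like the iterator-pair constructor shown above.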