Tags: c++, utf-8, character-encoding, utf-16, stdstring

Convert C++ std::string to UTF-16-LE encoded string


I've been searching for hours today and just can't find anything that works for me. The most recent thing I looked at, with no luck, is "How to convert UTF-8 encoded std::string to UTF-16 std::string".

My question is, with a brief explanation:

I want to compute a valid NTLM hash in standard C++, using OpenSSL's MD4 routines to create the hash. I know how to do that part, so does anyone know how to convert a std::string into a UTF-16LE encoded string that I can pass to the MD4 functions to get a correct digest?

So, can I take a std::string, which holds the char type, and convert it to a UTF-16LE encoded, variable-length string type, whether that be std::u16string or std::wstring?

And would I use s.c_str() or s.data(), and would the length() function report the correct length in both cases?


Solution

  • Apologies in advance: this will be an ugly reply with some long code. I ended up using the following function, while effectively compiling iconv into my Windows application file by file :)

    Hope this helps.

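    #include <errno.h>   // errno, EINVAL, EILSEQ
    #include <locale.h>  // setlocale
    #include <stdio.h>   // fprintf, perror
    #include <stdlib.h>  // calloc, free
    #include <string.h>  // strlen, strcpy, strchr
    #include <string>    // std::string (Windows branch)
    #include <iconv.h>   // iconv_open, iconv, iconv_close
    
    // Converts in_len bytes at `in` from the current locale's encoding to UTF-16LE.
    // Returns a calloc'd buffer holding *used_len bytes (not guaranteed to be
    // null-terminated; release it with free()), or 0 on failure.
    char* conver(const char* in, size_t in_len, size_t* used_len)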
    {
        const int CC_MUL = 2; // 16 bit
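        // Adopt the user's default locale. The LC_CTYPE name usually looks like
        // "language_territory.codeset"; the codeset part names the input encoding.
        setlocale(LC_ALL, "");
        char* t1 = setlocale(LC_CTYPE, "");
        if(t1 == NULL)
        {
            return 0;
        }
        char* locn = (char*)calloc(strlen(t1) + 1, sizeof(char));
        if(locn == NULL)
        {
            return 0;
        }
    
        strcpy(locn, t1);
        const char* dot = strchr(locn, '.');
        if(dot == NULL)
        {
            // No codeset in the locale name (e.g. "C"), so the source encoding is unknown.
            free(locn);
            return 0;
        }
        const char* enc = dot + 1;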
    
    #if _WINDOWS
        std::string win = "WINDOWS-";
        win += enc;
        enc = win.c_str();
    #endif
    
        iconv_t foo = iconv_open("UTF-16LE", enc);
    
        if(foo == (iconv_t)-1)
        {
            if (errno == EINVAL)
            {
                fprintf(stderr, "Conversion from %s is not supported\n", enc);
            }
            else
            {
                fprintf(stderr, "Initialization failure: %s\n", strerror(errno));
            }
            free(locn);
            return 0;
        }
    
        size_t out_len = CC_MUL * in_len;
        size_t saved_in_len = in_len;
        iconv(foo, NULL, NULL, NULL, NULL);
        char* converted = (char*)calloc(out_len, sizeof(char));
        if(converted == NULL)
        {
            iconv_close(foo);
            free(locn);
            return 0;
        }
        char *converted_start = converted;
        char* t = const_cast<char*>(in);
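        // iconv returns size_t, with (size_t)-1 signalling an error.
        size_t ret = iconv(foo,
                           &t,
                           &in_len,
                           &converted,
                           &out_len);
        int saved_errno = errno; // iconv_close may overwrite errno
        iconv_close(foo);
        errno = saved_errno;
        // Bytes written = original capacity minus the space iconv left unused.
        *used_len = CC_MUL * saved_in_len - out_len;
    
        if(ret == (size_t)-1)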
        {
            switch(errno)
            {
            case EILSEQ:
                fprintf(stderr,  "EILSEQ\n");
                break;
            case EINVAL:
                fprintf(stderr,  "EINVAL\n");
                break;
            }
    
            perror("iconv");
            free(converted_start); // do not leak the output buffer on failure
            free(locn);
            return 0;
        }
        else
        {
            free(locn);
            return converted_start;
        }
    }
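
If the input is already known to be UTF-8 and pulling in iconv is not an option, the same conversion can be sketched with the standard library alone. This is a swapped-in technique, not the function above: `utf8_to_utf16le_bytes` is a hypothetical helper name, and it assumes UTF-8 input rather than the locale's encoding. It returns the UTF-16LE bytes inside a std::string, which also speaks to the c_str()/data()/length() part of the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the original answer): decodes UTF-8 and
// returns the UTF-16LE encoding as raw bytes stored in a std::string.
std::string utf8_to_utf16le_bytes(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2); // UTF-16 needs at most 2 bytes per UTF-8 byte

    // Append one 16-bit code unit, low byte first (little-endian).
    auto put_unit = [&out](std::uint32_t u) {
        out.push_back(static_cast<char>(u & 0xFF));
        out.push_back(static_cast<char>((u >> 8) & 0xFF));
    };

    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const std::uint32_t min_cp[] = {0, 0x80, 0x800, 0x10000};

    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");

        if (cp < 0x10000) {
            put_unit(cp); // BMP: one code unit
        } else {
            cp -= 0x10000; // supplementary plane: surrogate pair
            put_unit(0xD800 + (cp >> 10));
            put_unit(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

With a result `r`, the hash input would be the pointer `r.data()` and the byte count `r.size()`. Since C++11, data() and c_str() return the same pointer, and size()/length() count every byte, including the embedded zero bytes that UTF-16LE produces for ASCII text. What breaks is anything that treats the buffer as a C string, such as strlen, because it stops at the first zero byte. If a std::u16string is used instead, length() counts 16-bit code units rather than bytes, and the in-memory byte order is the host's, which is little-endian only on little-endian machines.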