c++ windows unicode utf-8 emoji

Windows C++: simulating emoji input to a browser window gives scrambled characters


I am using Visual Studio Community 2022 to build a Windows C++ app that takes a line from a text file and puts it into a text field in a Chrome browser window, essentially a copy and paste by one method or another.

I have a UTF-8 encoded .txt file containing a sentence per line with an emoji at the end.

A sample emoji that I'm trying to use: 🤭

I read it into a vector<string> or vector<wstring> with code like this (I have tried both strings and wstrings):

    ifstream infile("Lines.txt");
    if(!infile.is_open()) return false;

    string aLine;
    // Read the database
    while (getline(infile, aLine))
    {
        theDB.push_back(aLine);
    }

    infile.close();

    return !theDB.empty();

I then tried to send it with SendMessage(aWindow, WM_CHAR, aCharacter, 0); and also by copying it to the clipboard and pasting it from there (in both CF_TEXT and CF_UNICODETEXT formats, using a method that has worked for me with plain text and Spanish letters).

I have also tried adding /utf8 in the compiler options.

No matter what I try, the emojis always get scrambled, although with most settings the rest of the text comes through fine.


Solution

  • You need to convert your UTF-8 text to wide characters (UTF-16 wchar_t). Luckily, Windows has the MultiByteToWideChar function to help with this. After the conversion you'll be able to copy the text to the clipboard using CF_UNICODETEXT.

    Here's the function I use.

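    // Converts a null-terminated UTF-8 string to UTF-16.
    // The caller must release the returned buffer with free().
    wchar_t* utf8toWChar(const char* utf8string)
    {
        // With a length of -1 the returned size already includes the terminating null.
        const int buffsize = MultiByteToWideChar(CP_UTF8, 0, utf8string, -1, nullptr, 0);
    
        wchar_t* gah = static_cast<wchar_t*>(malloc((buffsize + 1) * sizeof(wchar_t)));
        if (gah == nullptr)
        {
            return nullptr;
        }
    
        MultiByteToWideChar(CP_UTF8, 0, utf8string, -1, gah, buffsize);
        gah[buffsize] = 0;
    
        return gah;
    }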
    

    And here's the C++ version:

    std::wstring utf8towstring(const std::string& utf8string)
    {
       if (utf8string.empty())
       {
           return std::wstring();
       }
    
       int charactersWritten = ::MultiByteToWideChar(CP_UTF8, 0, utf8string.data(), (int)utf8string.size(), NULL, 0);
       if (0 == charactersWritten)
       { 
           return std::wstring();
       }
    
       std::wstring str2;
       str2.resize(charactersWritten);
    
       // Pass the actual buffer size; capacity() can be larger than size().
       charactersWritten = ::MultiByteToWideChar(CP_UTF8, 0, utf8string.data(), (int)utf8string.size(), &str2[0], static_cast<int>(str2.size()));
       if (0 == charactersWritten)
       {
          return std::wstring();
       }
    
       return str2;
    }
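
To see what the conversion above produces for the sample emoji, here is a rough portable sketch that decodes UTF-8 by hand into UTF-16 code units. The helper name utf8ToUtf16Sketch is made up for illustration and is not part of the Windows API. It assumes well-formed input and does almost no validation, so it is not a replacement for MultiByteToWideChar.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration only: decodes (assumed well-formed)
// UTF-8 into UTF-16 code units, emitting surrogate pairs above U+FFFF.
std::u16string utf8ToUtf16Sketch(const std::string& utf8)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t len = 1;

        // The lead byte tells us how many bytes the sequence has.
        if (lead < 0x80)              { cp = lead;        len = 1; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; len = 2; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; len = 3; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; len = 4; }
        else { ++i; continue; }            // stray byte: skip it

        if (i + len > utf8.size()) break;  // truncated sequence: stop

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        i += len;

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above U+FFFF need two UTF-16 units (a surrogate pair).
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The sample emoji 🤭 is U+1F92D, stored in the file as the four bytes F0 9F A4 AD, and it comes out as the two units D83E DD2D. That is a likely reason the other attempts scramble it: CF_TEXT holds ANSI code page text, which cannot represent the emoji, and as far as I know a WM_CHAR message carries a single UTF-16 code unit, so one message cannot hold the whole character.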
    

    Reading from a UTF-8 encoded text file, converting with utf8towstring, and then copying to the clipboard:

    Output: trying to use: 🤭 That worked!
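
The clipboard step itself is not shown above, so here is a minimal sketch of how it might look with the standard Win32 clipboard calls. The helper names unicodeTextByteCount and copyToClipboard are hypothetical and not taken from the answer. The Win32 part is guarded with _WIN32 so the snippet still compiles elsewhere.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: size in bytes of a CF_UNICODETEXT payload,
// i.e. every wchar_t in the string plus the terminating null.
std::size_t unicodeTextByteCount(const std::wstring& text)
{
    return (text.size() + 1) * sizeof(wchar_t);
}

#ifdef _WIN32
// Hypothetical helper: places `text` on the clipboard as CF_UNICODETEXT.
// `owner` may be nullptr. Returns false if any step fails.
bool copyToClipboard(HWND owner, const std::wstring& text)
{
    if (!OpenClipboard(owner)) return false;
    EmptyClipboard();

    const std::size_t bytes = unicodeTextByteCount(text);
    HGLOBAL block = GlobalAlloc(GMEM_MOVEABLE, bytes);
    if (block == nullptr) { CloseClipboard(); return false; }

    void* dest = GlobalLock(block);
    if (dest == nullptr) { GlobalFree(block); CloseClipboard(); return false; }
    std::memcpy(dest, text.c_str(), bytes);  // c_str() includes the null
    GlobalUnlock(block);

    // On success the system owns the memory; free it only on failure.
    const bool ok = SetClipboardData(CF_UNICODETEXT, block) != nullptr;
    if (!ok) GlobalFree(block);
    CloseClipboard();
    return ok;
}
#endif
```

A call might look like copyToClipboard(nullptr, utf8towstring(aLine)), followed by whatever paste step you already use for the browser window.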