c++ windows unicode text-files utf-16

Is it possible to set a text file to UTF-16?


My code for writing text works for ANSI characters, but when I try to write Japanese characters, they do not appear in the file. Do I need to use UTF-16 encoding? If so, how would I do that in code?

std::wstring filename;
std::wstring text;
filename = L"path";  // wide literal: a narrow "path" cannot be assigned to std::wstring
std::wofstream myfile;
myfile.open(filename, std::ios::app);
std::getline(std::wcin, text);
myfile << text << std::endl;
std::wcin.get();
myfile.close();

Solution

  • From the comments it seems your console correctly understands Unicode, and the issue is only with file output.

    Here's how to write a text file in UTF-16LE. Just tested in MSVC 2019 and it works.

    #include <string>
    #include <fstream>
    #include <iostream>
    #include <codecvt>
    #include <locale>
    
    int main() {
        std::wstring text = L"test тест 試験.";
        std::wofstream myfile("test.txt", std::ios::binary);
        std::locale loc(std::locale::classic(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>);
        myfile.imbue(loc);
        myfile << wchar_t(0xFEFF) /* UTF-16LE BOM */;
        myfile << text << "\n";
        myfile.close();
    }
    

    You must use std::ios::binary mode for output under Windows. Otherwise text mode expands every 0x0A byte to 0x0D 0x0A, so the two-byte UTF-16LE newline (0A 00) comes out as three bytes (0D 0A 00) and every code unit after it is misaligned.
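
    To see the effect in isolation, here is a minimal sketch that only simulates the text-mode translation on a byte string. simulate_text_mode is a made-up helper for illustration, not a CRT or standard library function:

```cpp
#include <string>

// Hypothetical helper (not a real library call): mimics what Windows
// text-mode output does to a byte stream, i.e. inserts a 0x0D byte
// in front of every 0x0A byte.
std::string simulate_text_mode(const std::string& bytes) {
    std::string out;
    for (char c : bytes) {
        if (c == '\n') {
            out.push_back('\r');  // the extra byte that breaks UTF-16 alignment
        }
        out.push_back(c);
    }
    return out;
}
```

    Feeding it the UTF-16LE newline bytes 0A 00 yields 0D 0A 00. By the same logic, any code unit that happens to contain a 0x0A byte, such as U+010A (bytes 0A 01), would be damaged too.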

    You don't have to write the BOM at the beginning, but having one greatly simplifies opening the file using the correct encoding in text editors.
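
    As a rough illustration of why the BOM helps, this is the kind of check a reader can do on the first two bytes of a file. Utf16Bom and detect_utf16_bom are names invented for this sketch, and real editors use more elaborate heuristics:

```cpp
#include <string>

// Hypothetical result type for this sketch.
enum class Utf16Bom { None, LittleEndian, BigEndian };

// Hypothetical helper: inspects the first two bytes of a file's contents
// and reports which UTF-16 byte order mark (if any) they form.
Utf16Bom detect_utf16_bom(const std::string& first_bytes) {
    if (first_bytes.size() >= 2) {
        const unsigned char b0 = static_cast<unsigned char>(first_bytes[0]);
        const unsigned char b1 = static_cast<unsigned char>(first_bytes[1]);
        if (b0 == 0xFF && b1 == 0xFE) return Utf16Bom::LittleEndian;  // U+FEFF stored as FF FE
        if (b0 == 0xFE && b1 == 0xFF) return Utf16Bom::BigEndian;     // U+FEFF stored as FE FF
    }
    return Utf16Bom::None;
}
```

    Note that FF FE followed by 00 00 is also the UTF-32LE BOM, so a careful reader would look at four bytes before deciding. Without any BOM, the reader has to guess the encoding from the content.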

    Unfortunately, std::codecvt_utf16 has been deprecated since C++17 with no replacement (yes, Unicode support in C++ is that bad).
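
    If you would rather avoid the deprecated facet, one possible workaround is to serialize the code units yourself and write raw bytes through a plain std::ofstream. The following is only a sketch under the assumption that the text is already UTF-16 (for example a std::u16string, or a std::wstring on Windows copied into one); append_le, to_utf16le_bytes and write_utf16le_file are hypothetical helper names, not standard functions:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: appends one UTF-16 code unit as two bytes,
// low byte first (little-endian).
static void append_le(std::string& out, char16_t unit) {
    out.push_back(static_cast<char>(static_cast<unsigned char>(unit & 0xFF)));
    out.push_back(static_cast<char>(static_cast<unsigned char>((unit >> 8) & 0xFF)));
}

// Hypothetical helper: turns UTF-16 text into UTF-16LE bytes,
// optionally prefixed with the BOM (U+FEFF).
std::string to_utf16le_bytes(const std::u16string& text, bool with_bom = true) {
    std::string out;
    out.reserve((text.size() + 1) * 2);
    if (with_bom) {
        append_le(out, 0xFEFF);
    }
    for (char16_t unit : text) {
        append_le(out, unit);  // surrogate pairs are just two consecutive units
    }
    return out;
}

// Hypothetical helper: writes the bytes in binary mode so that
// Windows does not apply any newline translation.
bool write_utf16le_file(const std::string& path, const std::u16string& text) {
    std::ofstream file(path, std::ios::binary);
    if (!file) {
        return false;
    }
    const std::string bytes = to_utf16le_bytes(text);
    file.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(file);
}
```

    A call such as write_utf16le_file("test.txt", u"test \u8A66\u9A13\n") should then produce a file that starts with FF FE. The escapes are used here only so the example does not depend on the source file encoding.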