Tags: c++, clang++, wofstream, wifstream

How to write a non-English string to a file and read from that file with C++?


I want to write a std::wstring to a file and read that content back as a std::wstring. This works as expected when the string contains only English letters, e.g. L"<any English letter>". The problem appears with characters from Bengali, Kannada, Japanese, or any other non-English script. I have tried several options:

  1. Converting the std::wstring to std::string and writing that to the file, then reading it back as std::string and converting it to std::wstring
    • Writing works (I can see the text in an editor), but reading returns the wrong characters
  2. Writing the std::wstring through a std::wofstream; this also fails for non-English text such as std::wstring data = L"হ্যালো ওয়ার্ল্ড";

Platforms are macOS and Linux; the language is C++.

Code:

#include <fstream>
#include <iostream>
#include <string>

bool
write_file(
    const char*         path,
    const std::wstring  data
) {
    bool status = false;
    try {
        std::wofstream file(path, std::ios::out|std::ios::trunc|std::ios::binary);
        if (file.is_open()) {
            //std::string data_str = convert_wstring_to_string(data);
            file.write(data.c_str(), (std::streamsize)data.size());
            file.close();
            status = true;
        }
    } catch (...) {
        std::cout<<"exception !"<<std::endl;
    }
    return status;
}


// Read Method

std::wstring
read_file(
    const char*  filename
) {
    std::wifstream fhandle(filename, std::ios::in | std::ios::binary);
    if (fhandle) {
        std::wstring contents;
        fhandle.seekg(0, std::ios::end);
        contents.resize((int)fhandle.tellg());
        fhandle.seekg(0, std::ios::beg);
        fhandle.read(&contents[0], contents.size());
        fhandle.close();
        return(contents);
    }
    else {
        return L"";
    }
}

// Main

int main()
{
    const char* file_path_1 = "./file_content_1.txt";
    const char* file_path_2 = "./file_content_2.txt";

    // std::wstring data = L"Text message to write onto the file\n";  // This works as expected.
    std::wstring data = L"হ্যালো ওয়ার্ল্ড";  // This does not work as expected.

    // Write some data to the first file.
    write_file(file_path_1, data);
    // Read the file back.
    std::wstring out = read_file(file_path_1);

    std::wcout << L"File Content: " << out << std::endl;
    // Write the same data to a second file.
    write_file(file_path_2, out);
    return 0;
}

Solution

  • How a wchar_t is read or written depends on the locale. The default locale ("C") generally doesn't accept anything but ASCII (Unicode code points 0x20...0x7E, plus a few control characters).

    Any time a program handles text, the very first statement in main should be:

    std::locale::global( std::locale( "" ) );
    

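    If the program uses any of the standard stream objects (std::wcout, std::wcin, and so on), the code should also imbue them with the global locale before any input or output. Those objects are constructed before main runs, so they do not pick up a later change to the global locale; streams constructed afterwards, such as a std::wofstream, do.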