c++ unicode widestring wofstream

Typographic apostrophe + wide string literal broke my wofstream (C++)


I’ve just encountered some strange behaviour when dealing with the ominous typographic apostrophe ( ’ ), as opposed to the typewriter apostrophe ( ' ). Used in a wide string literal, the apostrophe breaks wofstream: nothing from the apostrophe onward reaches the file.

This code works

ofstream file("test.txt");
file << "A’B" ;
file.close();

==> A’B

This code works

wofstream file("test.txt");
file << "A’B" ;
file.close();

==> A’B

This code fails

wofstream file("test.txt");
file << L"A’B" ;
file.close();

==> A

This code fails as well

wstring test = L"A’B";
wofstream file("test.txt");
file << test ;
file.close();

==> A

Any ideas?


Solution

  • You need to "enable" locale support before creating the wofstream:

    std::locale::global(std::locale("")); // Enable locale support: "" selects the user's preferred locale
    wofstream file("test.txt");
    file << L"A’B";
    

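    So if your system locale is en_US.UTF-8, the file test.txt will contain UTF-8 encoded data (5 bytes, because the apostrophe U+2019 takes 3 bytes in UTF-8). If your system locale is en_US.ISO8859-1, the text would be written in that 8-bit encoding (3 bytes), but only if the encoding contains the character. ISO 8859-1 has no typographic apostrophe, so the conversion would fail there too.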

    wofstream file("test.txt");
    file << "A’B" ;
    file.close();
    

    This code works because "A’B" is actually a UTF-8 string (assuming the source file is saved as UTF-8), and its bytes are written to the file one by one.

    Note: I assume you are using a POSIX-like OS and that the locale configured in your environment is something other than "C", which is the locale every program starts with.
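
To see whether a given locale can actually write a wide string, it may help to inspect the stream state after a flush. The following is only a minimal sketch and not part of the original answer: the helper names write_wide and preferred_locale are made up for illustration, and the locale is passed to the stream with imbue() instead of changing the global locale.

```cpp
#include <fstream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes `text` to `path` through a wofstream that uses
// `loc` for the wide-to-narrow conversion. Returns true only if the file
// could be opened and the stream is still good after flushing.
bool write_wide(const std::string& path, const std::wstring& text,
                const std::locale& loc) {
    std::wofstream file;
    file.imbue(loc);              // choose the conversion before opening
    file.open(path.c_str());
    if (!file.is_open()) {
        return false;
    }
    file << text;
    file.flush();                 // force the conversion so errors show up now
    return file.good();
}

// Hypothetical helper: the user's preferred locale (taken from the
// environment, e.g. LANG or LC_ALL on POSIX). Falls back to the classic "C"
// locale if the environment names a locale that is not installed.
std::locale preferred_locale() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

With std::locale::classic() (the "C" locale), write_wide("test.txt", L"A’B", ...) is expected to return false, because that locale cannot convert the apostrophe. With preferred_locale() on a system configured for en_US.UTF-8 it should return true. Checking the stream state this way makes the failure visible instead of silently producing a truncated file.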