c++ utf-8 special-characters ifstream iso-8859-1

Reading ISO-8859 type file containing special characters such as é in C++



I'm trying to read a file that is encoded in ISO-8859 (ANSI) and contains some Western European characters such as "é".
When I read the file and print the result, all the special characters appear as �, whereas plain ASCII letters appear correctly.

If I convert the file to UTF-8 and then run the same program, everything works perfectly.
Does anyone have an idea how to solve this problem? I tried using wifstream and wstring instead of ifstream and string, but that didn't help much.

Here's my sample code:

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main()
{
    ifstream myFS;
    myFS.open("test.txt", ios::in);
    string myString; 
    if(myFS.is_open()){
        while(myFS >> myString)
            cout << myString << endl;
    }
    myFS.close();
    return 0;
}

test.txt (ISO-8859-15 format) contains:

abcd éfg

result:

abcd 
�fg

Any advice will be appreciated. Thank you in advance!


+)
I forgot to mention my system environment.
I'm using the Ubuntu 10.10 (Maverick) console with g++ 4.4.5.
Thanks!


Solution

  • Your console is set to use UTF-8, so when you dump the ISO-8859-15 file to the console with cout, the bytes above the ASCII range are not valid UTF-8 and show up as �. Characters with codes below 128 are encoded identically in both encodings, which is why all the plain ASCII characters appear correctly on your screen.

    The output from the program is actually correct; it's just that your console is not set up to display ISO-8859-15 output.

    I'd also recommend opening files that aren't pure ASCII with ios::binary, or you may run into problems on other platforms later.