c# utf-8 ansi

Unable to convert special characters in UTF-8 file into ANSI


I have a file that needs to be read, with some text added at the end. The program failed because of the character "í". On opening the file in Notepad++ with UTF-8 encoding, I could see: [image]

In my C# code I tried to convert it to the Default encoding, but the application writes "?" instead of "í".

Sample code:

string processFilePath = @"D:\Test\File1.txt";
string outfile = @"D:\Test\File2.txt";

using (StreamReader reader = new StreamReader(processFilePath))
{
    using (StreamWriter writer = new StreamWriter(outfile, false, Encoding.Default))
    {
        writer.WriteLine(reader.ReadToEnd());
    }
}

                

I looked into similar questions on SO (the code snippet above is a modified version of the one from here): UTF-8 to ANSI Conversion using C#

I tried the different encodings available in System.Text.Encoding (ASCII, UTF*, Default), but the best I could get is a "?" instead of "í".

I also went through http://kunststube.net/encoding/ and learned a lot, but was still unable to resolve the issue.

What I am getting: [image]

What I need: [image]

On the Microsoft website: [image]

What else am I missing? (This would have been easy if System.Text.Encoding.ANSI existed.)


Solution

  • MSDN:

    StreamReader defaults to UTF-8 encoding unless specified otherwise, instead of defaulting to the ANSI code page for the current system.

    That is, StreamReader(processFilePath) decodes the file as UTF-8, which does not seem to match what the file actually contains. If the source text is ANSI (most likely Windows-1252 for Spanish), specify that encoding when reading:

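    // Decode the source as Windows-1252 (code page 1252) instead of the UTF-8 default.
    using (StreamReader reader = new StreamReader(processFilePath, Encoding.GetEncoding(1252)))
    {
        // false = overwrite outfile; the copy is written as UTF-8.
        using (StreamWriter writer = new StreamWriter(outfile, false, Encoding.UTF8))
        {
            writer.WriteLine(reader.ReadToEnd());
        }
    }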
    

    Note that both encodings are specified explicitly: 1252 for the reader and UTF8 for the writer.
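
To see why the "?" shows up, here is a minimal in-memory sketch of the mechanism described above. It assumes the file stores "í" as the single byte 0xED. The names QuestionMarkDemo and Transcode are hypothetical, and ISO-8859-1 is used as a stand-in for Windows-1252 because it is available on every .NET runtime without extra setup and maps "í" to the same byte. On .NET Core / .NET 5+, Encoding.GetEncoding(1252) needs Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) to be called first.

```csharp
using System;
using System.Text;

public static class QuestionMarkDemo
{
    // Decode `raw` with `readAs`, then encode the resulting text with `writeAs`,
    // mirroring what a StreamReader/StreamWriter pair does with a file.
    public static byte[] Transcode(byte[] raw, Encoding readAs, Encoding writeAs)
    {
        string text = readAs.GetString(raw);
        return writeAs.GetBytes(text);
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for Windows-1252 (same byte for "í").
        Encoding singleByte = Encoding.GetEncoding("iso-8859-1");
        byte[] raw = { 0x61, 0xED }; // "aí" as stored in a single-byte (ANSI-style) file

        // Wrong: 0xED alone is not valid UTF-8, so it decodes to U+FFFD,
        // which the single-byte encoder can only write as "?" (0x3F).
        byte[] wrong = Transcode(raw, Encoding.UTF8, singleByte);

        // Right: decode with the encoding the file really uses, then write UTF-8.
        byte[] right = Transcode(raw, singleByte, Encoding.UTF8);

        Console.WriteLine(BitConverter.ToString(wrong)); // 61-3F
        Console.WriteLine(BitConverter.ToString(right)); // 61-C3-AD
    }
}
```

The "í" is already lost at the read step, which is why no choice of writer encoding in the original code could bring it back.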

    P.S. Also note that passing false to StreamWriter overwrites the file rather than appending to the end; pass true to append.
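
If the goal is only to add text at the end of the existing file, as the question describes, one possible approach is to open the writer in append mode with the same encoding the file already uses. This is a sketch under assumptions: AppendDemo and AppendLine are hypothetical names, the temporary file exists only for the demonstration, and ISO-8859-1 again stands in for Windows-1252 (on .NET Framework, Encoding.GetEncoding(1252) can be passed directly).

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendDemo
{
    // Appends one line to an existing file, encoding the new text with `encoding`.
    public static void AppendLine(string path, string line, Encoding encoding)
    {
        // The second argument (true) selects append mode; false would overwrite.
        using (StreamWriter writer = new StreamWriter(path, true, encoding))
        {
            writer.WriteLine(line);
        }
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for Windows-1252 (same byte for "í").
        Encoding singleByte = Encoding.GetEncoding("iso-8859-1");

        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x61, 0xED }); // existing content "aí"

        AppendLine(path, "fin", singleByte);

        // The original bytes are untouched; the new text follows them.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path)));
        File.Delete(path);
    }
}
```

Because the existing bytes are never decoded and rewritten, this avoids the conversion problem entirely, provided the appended text is encoded the same way as the rest of the file.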