c#, text, encoding

Appending to txt file produces strange characters


My code is:

File.AppendAllText(
       @"C:\Users\aalig\Downloads\stok.txt", 
       "text content" + Environment.NewLine, 
       Encoding.UTF8);

But after writing, the text file looks like this: the original text comes first, correctly (as expected), but it is followed by hieroglyphs instead of "text content":

Image of invalid text


Solution

  • The source file has a different encoding (not UTF-8, as the code specifies). When AppendAllText adds text with UTF-8 as requested, it appends a UTF-8 encoded sequence of bytes. When that sequence of bytes is read with another encoding (e.g. UTF-16), it is interpreted as a different set of Unicode characters.
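
    The byte-level effect can be reproduced in memory, without touching a file. The following is a minimal sketch, not part of the original answer; the class and method names (MojibakeDemo, Utf8ReadAsUtf16) are made up for illustration. It encodes the string as UTF-8 and then decodes the same bytes as UTF-16 little-endian, which is what a reader does when it trusts a UTF-16 byte order mark at the start of the file.

    ```csharp
    using System;
    using System.Text;

    static class MojibakeDemo
    {
        // Hypothetical helper: encode as UTF-8, then (wrongly) decode the bytes as UTF-16 LE.
        public static string Utf8ReadAsUtf16(string text)
        {
            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
            return Encoding.Unicode.GetString(utf8Bytes);
        }

        static void Main()
        {
            string garbled = Utf8ReadAsUtf16("text content");

            // Print code points rather than glyphs so the result does not
            // depend on the console's font or output encoding.
            foreach (char c in garbled)
                Console.Write($"U+{(int)c:X4} ");
            Console.WriteLine();
            // prints: U+6574 U+7478 U+6320 U+6E6F U+6574 U+746E
        }
    }
    ```

    Each pair of ASCII bytes collapses into one UTF-16 code unit ("te" is 0x74 0x65, read little-endian as U+6574), so twelve Latin letters turn into six CJK characters, the same ones that appear in the answer's sample output.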

    Two possible fixes:

    1. Use the file's existing encoding when appending text (if you know it).
    2. Read all the text first and then rewrite the file with the encoding of your choice.
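
    For option 1, when the encoding is not known in advance, one possible approach is to let StreamReader inspect the file's byte order mark (BOM) and reuse whatever it reports. This is only a sketch under stated assumptions: the helper names (DetectByBom, AppendMatching) are hypothetical, the file is assumed to exist, and a file without a BOM cannot be identified this way, so the code falls back to BOM-less UTF-8 as an arbitrary default.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    static class EncodingAwareAppend
    {
        // Hypothetical helper: returns the encoding indicated by the file's BOM,
        // or the supplied fallback when no BOM is present (or the file is empty).
        public static Encoding DetectByBom(string path, Encoding fallback)
        {
            using (var reader = new StreamReader(path, fallback, detectEncodingFromByteOrderMarks: true))
            {
                reader.Read(); // detection only happens on the first read
                return reader.CurrentEncoding;
            }
        }

        // Hypothetical helper: appends text using the encoding the file already uses.
        public static void AppendMatching(string path, string text)
        {
            // Assumption: BOM-less files are treated as UTF-8.
            Encoding encoding = DetectByBom(path, new UTF8Encoding(false));
            File.AppendAllText(path, text, encoding);
        }

        static void Main()
        {
            string path = Path.GetTempFileName();
            File.WriteAllText(path, "Hello and Welcome" + Environment.NewLine, Encoding.Unicode);

            AppendMatching(path, "text content" + Environment.NewLine);

            // Both lines now read back intact because the appended bytes are UTF-16 too.
            Console.Write(File.ReadAllText(path));
            File.Delete(path);
        }
    }
    ```

    If the file's encoding is already known, passing it directly to File.AppendAllText is simpler and avoids the extra read.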

    Sample that produces an invalid result:

    string path = @"e:\temp\MyTest.txt";
    File.WriteAllText(path, "Hello and Welcome" + Environment.NewLine, Encoding.Unicode);
    // Note that AppendAllText uses a different encoding than WriteAllText.
    // To correct this, specify the same Encoding.Unicode (option 1).
    File.AppendAllText(path, "text content" + Environment.NewLine, Encoding.UTF8);
    
    Console.WriteLine(File.ReadAllText(path));
    

    results in

    Hello and Welcome
    整瑸挠湯整瑮਍
    

    Sample that reads the whole file and appends your text (option 2):

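    string newContentToAdd = "text content";
    // ReadAllText detects the existing encoding from the byte order mark;
    // WriteAllText without an encoding argument rewrites the whole file as UTF-8.
    File.WriteAllText(path, 
       File.ReadAllText(path) 
       + Environment.NewLine 
       + newContentToAdd);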