C#: change a file's encoding without loading the whole file into memory


I need to change a file's encoding. The method I've used loads the whole file into memory:

string DestinationString = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(File.ReadAllText(FileName)));
File.WriteAllText(FileName, DestinationString, new System.Text.ASCIIEncoding());

This works for smaller files (in this case, converting the file to ASCII), but it won't work for files larger than 2 GB. How can I change the encoding without loading the file's entire content into memory?


Solution

  • You can't do this by writing to the same file while you are reading it, but you can easily write to a different file: read a chunk of characters at a time in the source encoding and write each chunk in the target encoding.

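    using System.IO;
    using System.Text;

    public void RewriteFile(string source, Encoding sourceEncoding,
                            string destination, Encoding destinationEncoding)
    {
        // File.OpenText and File.CreateText have no overloads that take an
        // Encoding, so construct the reader and writer directly.
        using (var reader = new StreamReader(source, sourceEncoding))
        {
            // false = overwrite the destination rather than append to it.
            using (var writer = new StreamWriter(destination, false, destinationEncoding))
            {
                // Only one 16K-character chunk is held in memory at a time.
                char[] buffer = new char[16384];
                int charsRead;
                while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
                {
                    writer.Write(buffer, 0, charsRead);
                }
            }
        }
    }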
    

    Once the conversion has finished, you can always rename the new file to the original filename.
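
The rename step could be sketched roughly as follows. This is only an illustration, not part of the original answer: `EncodingConverter`, `ConvertInPlace` and the `.tmp` suffix are hypothetical names, and it assumes the temporary file can be created next to the original so that `File.Replace` stays on a single volume.

```csharp
using System.IO;
using System.Text;

public static class EncodingConverter
{
    // Hypothetical helper: converts a file chunk by chunk into a temporary
    // file, then swaps the temporary file in under the original name.
    public static void ConvertInPlace(string path, Encoding sourceEncoding,
                                      Encoding destinationEncoding)
    {
        // Assumption: a sibling ".tmp" file is acceptable as scratch space.
        string tempPath = path + ".tmp";

        using (var reader = new StreamReader(path, sourceEncoding))
        using (var writer = new StreamWriter(tempPath, false, destinationEncoding))
        {
            char[] buffer = new char[16384];
            int charsRead;
            while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, charsRead);
            }
        }

        // Both streams are closed at this point, so the swap is safe.
        // Passing null as the third argument means no backup copy is kept.
        File.Replace(tempPath, path, null);
    }
}
```

A call such as `EncodingConverter.ConvertInPlace(fileName, Encoding.UTF8, Encoding.ASCII)` would then leave the converted content under the original name. Note that the disk needs enough free space to hold both copies until the swap completes.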