Tags: c#, replace, out-of-memory, flat-file

'System.OutOfMemoryException' when using ReadAllText() method


I have a tab-delimited file with 8,000,000+ rows, some of which contain rogue tabs.

For example (each -> stands for a tab):

a->b->c->d
a->b->c->-->-->--d
a->b->c->d
a->b->c->d

I have a method to rectify the rogue tabs (3 tabs to 1 tab) as follows:

string text = File.ReadAllText(filePath);
text = text.Replace("\t\t\t", "\t");
File.WriteAllText(filePath, text);

The above code block produces the following error:

An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll

How can I read and write just one row at a time so that the whole file is not in memory?


Solution

  • File.ReadLines gives you a lazy IEnumerable<string>. You can iterate over that instead and only load one line at a time.

    You'll need to write to a different file than the one you read from, though. When you finish, you can delete the original and rename the new file.

    Here's a one-liner that processes the file:

    // Requires: using System.IO; using System.Linq;
    File.WriteAllLines(outputFile,
        File.ReadLines(inputFile)
            .Select(t => t.Replace("\t\t\t", "\t"))
    );
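
To make the delete/rename step concrete, here is a minimal sketch that wraps the one-liner in a helper. The class name TabFixer, the method name CollapseTabsInFile, and the ".tmp" suffix are my own assumptions for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Linq;

public static class TabFixer
{
    // Hypothetical helper (not from the original answer): rewrites the file
    // at `path` with every run of three tabs collapsed to a single tab.
    public static void CollapseTabsInFile(string path)
    {
        // Assumed naming scheme: write to a temporary file next to the input,
        // because the input is still open while its lines are read lazily.
        string tempPath = path + ".tmp";

        // ReadLines yields one line at a time, so the whole file is never
        // held in memory at once.
        File.WriteAllLines(
            tempPath,
            File.ReadLines(path).Select(line => line.Replace("\t\t\t", "\t")));

        // Swap the cleaned file into place once writing has finished.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

You would call it as TabFixer.CollapseTabsInFile(filePath). Note that File.WriteAllLines writes UTF-8 by default and ends every line with Environment.NewLine, so the output's encoding and line endings may differ from the input's. If that matters, a StreamReader/StreamWriter pair with an explicit encoding is probably the safer route.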