Open and read thousands of files as fast as possible

I need to open and read thousands of files as fast as possible.

I ran a few tests on 13,592 files and found Method 1 to be slightly faster than Method 2. The files are usually between 800 bytes and 4 kB. Is there anything I can do to make this I/O-bound process faster?

Method 1:
    Run 1: 3:05 (don't know what happened here)
    Run 2: 1:55
    Run 3: 2:06
    Run 4: 2:02
Method 2:
    Run 1: 2:04
    Run 2: 2:08
    Run 3: 2:04
    Run 4: 2:12

Here's the code:

using System;
using System.IO;
using System.Linq;

public class FileOpenerUtil
{
    /// <summary>
    /// Reads a file into a string with '\n' line endings, retrying until the file can be opened.
    /// </summary>
    /// <param name="fullFilePath">Full path of the file to read</param>
    /// <returns>The file contents, one '\n' after every line</returns>
    public static string ReadFileToString(string fullFilePath)
    {
        while (true)
        {
            try
            {
                // Method 1
                using (StreamReader sr = File.OpenText(fullFilePath))
                {
                    string fullMessage = "";
                    string s;
                    while ((s = sr.ReadLine()) != null)
                    {
                        fullMessage += s + "\n";
                    }
                    return RemoveCarriageReturn(fullMessage);
                }
                // Method 2
                /*using (File.Open(fullFilePath, FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    Console.WriteLine("Output file {0} ready.", fullFilePath);
                    string[] lines = File.ReadAllLines(fullFilePath);
                    // Every new line under the previous line
                    string fullMessage = lines.Aggregate("", (current, s) => current + s + "\n");
                    return RemoveCarriageReturn(fullMessage);
                }*/
            }
            catch (FileNotFoundException ex)
            {
                Console.WriteLine("Output file {0} not yet ready ({1})", fullFilePath, ex.Message);
            }
            catch (IOException ex)
            {
                Console.WriteLine("Output file {0} not yet ready ({1})", fullFilePath, ex.Message);
            }
            catch (UnauthorizedAccessException ex)
            {
                Console.WriteLine("Output file {0} not yet ready ({1})", fullFilePath, ex.Message);
            }
        }
    }

    /// <summary>
    /// Removes '\r' from a string
    /// </summary>
    /// <param name="message">The text that has to be changed</param>
    /// <returns>The changed text</returns>
    private static string RemoveCarriageReturn(string message)
    {
        return message.Replace("\r", "");
    }
}

The files I'm reading are .HL7 files and look like this:

    MSH|^~\&|OAZIS||||20150430235954||ADT^A03|23669166|P|2.3||||||ASCII
    EVN|A03|20150430235954||||201504302359
    PID|1||6001144000||LastName^FirstName^^^Mevr.|LastName^FirstName|19600114|F|||GStreetName Number^^City^^PostalCode^B^H||09/3444556^^PH~0476519246echtg^^CP||NL|M||28783409^^^^VN|0000000000|60011402843||||||B||||N
    PD1||||003847^LastName^FirstName||||||||N|||0
    PV1|1|O|FDAG^000^053^001^0^2|NULL||FDAG^000^053^001|003847^LastName^FirstName||006813^LastName^FirstName|1900|00||||||006813^LastName^FirstName|0|28783409^^^^VN|1^20150430|01|||||||||||||||1|1||D|||||201504301336|201504302359
    OBX|1|CE|KIND_OF_DIS|RCM|1^1 Op medisch advies
    OBX|2|CE|DESTINATION_DIS|RCM|1^1 Terug naar huis

Once I have opened the file, I parse the string with j4jayant's HL7 parser and close the file.


  • I used 50,000 files of varying size (500 to 1024 bytes).

    Test 1: Your method 1 StreamReader sr = File.OpenText(fullFilePath); sr.ReadLine();
    Seconds: 3,4658937968113
    Test 2: Your method 2 File.ReadAllLines(fullFilePath)
    Seconds: 5,5008349279222
    Test 3: File.ReadAllText(fullFilePath);
    Seconds: 3,30782645637133
    Test 4: BinaryReader b = new BinaryReader; b.ReadString();
    Seconds: 5,85779941381009
    Test 5: Windows FileReader
    Seconds: 3,07036554759848
    Test 6: StreamReader sr = File.OpenText(fullFilePath); sr.ReadToEnd();
    Seconds: 3,31464109255517
    Test 7: StreamReader sr = File.OpenText(fullFilePath); sr.ReadToEnd();
    Seconds: 3,3364683664508
    Test 8: StreamReader sr = File.OpenText(fullFilePath); sr.ReadLine();
    Seconds: 3,40426888695317
    Test 9: FileStream + BufferedStream + StreamReader
    Seconds: 4,02871911079061
    Test 10: Parallel.For using code File.ReadAllText(fullFilePath);
    Seconds: 0,89543632235447

    The best single-threaded results are Test 5 and Test 3.
    Test 3 uses File.ReadAllText(fullFilePath);
    Test 5 uses Windows FileReader
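
As a rough sketch of how Test 3 could replace Method 1 from the question, assuming the goal is still a string that uses "\n" line endings and contains no "\r". The names `FastFileReader` and `NormalizeNewlines` are hypothetical and not part of the question or of any library. HL7 segments are commonly terminated by a bare "\r", so this sketch converts "\r\n" and lone "\r" to "\n" instead of simply deleting "\r"; that matches what ReadLine plus "\n" does, except that no extra "\n" is appended after a last line that has no terminator.

```csharp
using System;
using System.IO;

public static class FastFileReader
{
    // Hypothetical replacement for ReadFileToString: one ReadAllText call per file (Test 3).
    public static string ReadFileToString(string fullFilePath)
    {
        return NormalizeNewlines(File.ReadAllText(fullFilePath));
    }

    // Converts "\r\n" and lone "\r" line endings to "\n".
    public static string NormalizeNewlines(string text)
    {
        return text.Replace("\r\n", "\n").Replace("\r", "\n");
    }
}
```

This avoids the repeated string concatenation inside the ReadLine loop, although for files of a few kB the difference is likely to be small next to the cost of opening each file.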

    If you can use threads, Test 10 is by far the quickest.


    int maxFiles = 50000;
    Parallel.For(0, maxFiles, x =>
    {
        // use the loop index x so that each iteration reads a different file
        Util.Method1("readtext_" + x + ".txt"); // your read method
    });
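
If the file contents are needed afterwards, for example to hand them to the HL7 parser, a variant along the following lines might work. This is only a sketch: `ParallelFileReader` and `ReadAll` are hypothetical names, and it assumes the list of paths is known up front and that all the contents fit in memory.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ParallelFileReader
{
    // Reads every file in parallel; contents[i] holds the text of paths[i].
    public static string[] ReadAll(string[] paths)
    {
        var contents = new string[paths.Length];
        Parallel.For(0, paths.Length, i =>
        {
            // Each iteration writes only to its own array slot, so no locking is needed.
            contents[i] = File.ReadAllText(paths[i]);
        });
        return contents;
    }
}
```

Keep in mind that an exception thrown inside the loop body (for example an IOException for a file that is still locked) surfaces from Parallel.For wrapped in an AggregateException, so any retry logic would need to live inside the lambda.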

    After using RAMMap to empty the standby list (so the files have to be read from disk instead of from the Windows file cache):

    Test 1: Your method 1 StreamReader sr = File.OpenText(fullFilePath); sr.ReadLine();
    Seconds: 15,1785750622961
    Test 2: Your method 2 File.ReadAllLines(fullFilePath)
    Seconds: 17,650864469466
    Test 3: File.ReadAllText(fullFilePath);
    Seconds: 14,8985912878328
    Test 4: BinaryReader b = new BinaryReader; b.ReadString();
    Seconds: 18,1603815767866
    Test 5: Windows FileReader
    Seconds: 14,5059765845334
    Test 6: StreamReader sr = File.OpenText(fullFilePath); sr.ReadToEnd();
    Seconds: 14,8649786336991
    Test 7: StreamReader sr = File.OpenText(fullFilePath); sr.ReadToEnd();
    Seconds: 14,830567197641
    Test 8: StreamReader sr = File.OpenText(fullFilePath); sr.ReadLine();
    Seconds: 14,9965866575751
    Test 9: FileStream + BufferedStream + StreamReader
    Seconds: 15,7336450516575
    Test 10: Parallel.For() using code File.ReadAllText(fullFilePath);
    Seconds: 4,11343060325439
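
The timing code itself is not shown above. One plausible way to take such measurements is a Stopwatch around a single pass over all files, as in this sketch; `ReadBenchmark` and `TimeSeconds` are hypothetical names and this is not necessarily how the numbers above were produced.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadBenchmark
{
    // Times one pass of readMethod over all paths and returns the elapsed seconds.
    public static double TimeSeconds(string[] paths, Func<string, string> readMethod)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            readMethod(path);
        }
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }
}
```

For example, `ReadBenchmark.TimeSeconds(paths, File.ReadAllText)` would time Test 3. As the two result sets show, whether the files are already in the file cache changes the numbers considerably, so each method should be measured under the same cache conditions.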