Tags: c#, performance, algorithm, words

How to load a file with words into a list where the file has over 3 million lines


Is it possible to load a file with 3 or 4 million lines in less than 1 second (1.000000)? Each line contains one word. Words are 1 to 17 characters long (does that matter?).

My code is now:

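// Requires: using System.Collections.Generic; using System.IO; using System.Text;
List<string> LoadDictionary(string filename)
{
    List<string> wordsDictionary = new List<string>();

    // Windows-1250 is needed for Polish letters such as ę, ą, ć, ł.
    // On .NET Core / .NET 5+, call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
    Encoding enc = Encoding.GetEncoding(1250);
    using (StreamReader r = new StreamReader(filename, enc))
    {
        string line;
        // Read one word per line until the end of the file.
        while ((line = r.ReadLine()) != null)
        {
            // Keep only words with at least 3 characters.
            if (line.Length > 2)
            {
                wordsDictionary.Add(line);
            }
        }
    }

    return wordsDictionary;
}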

Results of timed execution:

[Image: timing result for loading 4 million words]

How can I make the method run in half the time?


Solution

  • If you know that your list will be large, you should set a good starting capacity.

    List<string> wordsDictionary = new List<string>( 100000 );
    

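    If you don't do this, the list has to keep increasing its capacity as items are added. Each increase allocates a larger internal array and copies the existing items into it, which takes a bit of time. This likely won't cut the load time in half, but it's a start.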
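
Putting the suggestion together with the loader from the question, a minimal sketch might look like the following. The class name `DictionaryLoader` and the `expectedWords` parameter are assumptions added for illustration, and the encoding is passed in by the caller so the example does not depend on code page 1250 being available. The value 4,000,000 is only a guess based on the line count mentioned in the question; the actual speedup has not been measured here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class DictionaryLoader
{
    // expectedWords is a hypothetical hint: choose a value close to the
    // real number of lines so the list rarely (or never) has to grow.
    public static List<string> LoadDictionary(
        string filename, Encoding enc, int expectedWords = 4000000)
    {
        // Pre-size the backing array once instead of growing it repeatedly.
        var wordsDictionary = new List<string>(expectedWords);

        using (var r = new StreamReader(filename, enc))
        {
            string line;
            while ((line = r.ReadLine()) != null)
            {
                // Same filter as in the question: skip words shorter than 3 characters.
                if (line.Length > 2)
                {
                    wordsDictionary.Add(line);
                }
            }
        }

        return wordsDictionary;
    }
}
```

It could be called as `DictionaryLoader.LoadDictionary(path, Encoding.GetEncoding(1250))`. Over-estimating the capacity costs some memory (one reference slot per unused element), so a rough estimate derived from the file is usually a reasonable trade-off.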