C#: Dictionary or list of pairs


I'm reading a large amount of data that I need to store as pairs of corresponding values. Then I'll need to look up the corresponding value for each key, once for almost every key.

Since I'll be searching for each key only once, is it worth it to use a Dictionary, or would it be better in terms of performance to use List<Pair<...>> here?


Solution

  • It depends on how much data you have and how expensive it is to compute the hashes of your keys.

    An average lookup in the List takes about N/2 comparisons. For the Dictionary, a lookup costs roughly the same hash resolution no matter how many items there are, but that hash resolution can be fairly expensive. Therefore, for any data set there exists some number N where the List's average lookup cost starts to exceed the cost of the hash resolution. Below this number, the List is faster. Above it, the Dictionary is faster.

    Where this breaking point happens depends on your data. I recall many years ago we used to use just 10 items as a rule of thumb, but I have no idea how valid that really is. Really, though, the best way to find out is to actually try both on a sample of your real data.
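
    As a rough illustration of "try both", here is a minimal timing sketch. The key type (int), the size n, and the names LookupBenchmark, SumViaList and SumViaDictionary are all made up for the example; substitute a sample of your real data and key type, since the hash cost is exactly what varies. The Dictionary timing deliberately includes the time to build it, because you pay that cost too.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    public static class LookupBenchmark
    {
        // Looks up each key once by scanning the list from the start.
        public static long SumViaList(List<KeyValuePair<int, int>> pairs, int[] keys)
        {
            long sum = 0;
            foreach (int key in keys)
            {
                foreach (var pair in pairs)
                {
                    if (pair.Key == key) { sum += pair.Value; break; }
                }
            }
            return sum;
        }

        // Looks up each key once through the dictionary's hash lookup.
        public static long SumViaDictionary(Dictionary<int, int> map, int[] keys)
        {
            long sum = 0;
            foreach (int key in keys)
            {
                if (map.TryGetValue(key, out int value)) sum += value;
            }
            return sum;
        }

        public static void Main()
        {
            const int n = 10_000; // hypothetical size; try several values
            var pairs = Enumerable.Range(0, n)
                .Select(i => new KeyValuePair<int, int>(i, i * 2))
                .ToList();
            // Look the keys up in reverse order so the list scan gets no free ride.
            int[] keys = Enumerable.Range(0, n).Reverse().ToArray();

            var sw = Stopwatch.StartNew();
            long listSum = SumViaList(pairs, keys);
            sw.Stop();
            Console.WriteLine($"List:       {sw.Elapsed.TotalMilliseconds} ms");

            sw.Restart();
            // Building the dictionary is part of the cost being measured.
            var map = new Dictionary<int, int>(pairs.Count);
            foreach (var pair in pairs) map[pair.Key] = pair.Value;
            long dictSum = SumViaDictionary(map, keys);
            sw.Stop();
            Console.WriteLine($"Dictionary: {sw.Elapsed.TotalMilliseconds} ms");

            // Both approaches should agree on the result.
            Console.WriteLine(listSum == dictSum ? "results match" : "MISMATCH");
        }
    }
    ```

    A single Stopwatch run like this is noisy (JIT warm-up, garbage collection), so treat the numbers as a rough guide and repeat the measurement a few times before drawing conclusions.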

    We also need to throw in sorting as a further wrinkle. If you know you'll need to find every item in a list exactly once, then the total cost for the basic unsorted data is N items * N/2 average lookup time. This is an O(n^2) algorithm, which is not great. But very often you can sort the data so you know the look-ups will happen sequentially, and then walk the List. The sort costs O(n*log(n)) and the walk itself is only O(n), so this can be MUCH more efficient overall, and it is more likely to beat the Dictionary.
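
    The sort-then-walk idea can be sketched as follows. This is only an illustration under assumptions: int keys, string values, unique keys in the pair list, and the names SortedWalk and Match are hypothetical. Both the pairs and the keys to look up are sorted, and then a single cursor moves forward through the pairs, never backing up.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class SortedWalk
    {
        // Returns the pair for every requested key that exists, in key order.
        // Assumes the keys in 'pairs' are unique.
        public static List<KeyValuePair<int, string>> Match(
            List<KeyValuePair<int, string>> pairs, List<int> keys)
        {
            // Sort copies so the caller's lists are left untouched: O(n log n).
            var sortedPairs = new List<KeyValuePair<int, string>>(pairs);
            sortedPairs.Sort((a, b) => a.Key.CompareTo(b.Key));
            var sortedKeys = new List<int>(keys);
            sortedKeys.Sort();

            var found = new List<KeyValuePair<int, string>>();
            int p = 0; // cursor into sortedPairs; it only ever moves forward
            foreach (int key in sortedKeys)
            {
                // Skip pairs whose key is smaller than the one we want.
                while (p < sortedPairs.Count && sortedPairs[p].Key < key) p++;
                if (p < sortedPairs.Count && sortedPairs[p].Key == key)
                    found.Add(sortedPairs[p]);
            }
            return found;
        }

        public static void Main()
        {
            var pairs = new List<KeyValuePair<int, string>>
            {
                new KeyValuePair<int, string>(30, "thirty"),
                new KeyValuePair<int, string>(10, "ten"),
                new KeyValuePair<int, string>(20, "twenty"),
            };
            var keys = new List<int> { 20, 30, 10 };

            foreach (var hit in Match(pairs, keys))
                Console.WriteLine($"{hit.Key} -> {hit.Value}");
            // Prints:
            // 10 -> ten
            // 20 -> twenty
            // 30 -> thirty
        }
    }
    ```

    Note that the results come back in key order rather than in the order the keys were requested. If the original order matters, that is one more reason to measure this against a plain Dictionary on your own data.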