Search code examples
c#.netperformancepoker

Optimizing Lookups: Dictionary key lookups vs. Array index lookups


I'm writing a 7 card poker hand evaluator as one of my pet projects. While trying to optimize its speed (I like the challenge), I was shocked to find that the performance of Dictionary key lookups was quite slow compared to array index lookups.

For example, I ran this sample code that enumerates over all 52 choose 7 = 133,784,560 possible 7 card hands:

var intDict = new Dictionary<int, int>();
var intList = new List<int>();
for (int i = 0; i < 100000; i ++)
{
    intDict.Add(i, i);  
    intList.Add(i);
}

int result;

var sw = new Stopwatch();
sw.Start();
for (int card1 = 0; card1 < 46; card1++)
  for (int card2 = card1 + 1; card2 < 47; card2++)
    for (int card3 = card2 + 1; card3 < 48; card3++)
      for (int card4 = card3 + 1; card4 < 49; card4++)
        for (int card5 = card4 + 1; card5 < 50; card5++)
          for (int card6 = card5 + 1; card6 < 51; card6++)
            for (int card7 = card6 + 1; card7 < 52; card7++)
              result = intDict[32131]; // perform C(52,7) dictionary key lookups
sw.Stop();
Console.WriteLine("time for dictionary lookups: {0} ms", sw.ElapsedMilliseconds);

sw.Reset();

sw.Start();
for (int card1 = 0; card1 < 46; card1++)
  for (int card2 = card1 + 1; card2 < 47; card2++)
    for (int card3 = card2 + 1; card3 < 48; card3++)
      for (int card4 = card3 + 1; card4 < 49; card4++)
        for (int card5 = card4 + 1; card5 < 50; card5++)
          for (int card6 = card5 + 1; card6 < 51; card6++)
            for (int card7 = card6 + 1; card7 < 52; card7++)
              result = intList[32131]; // perform C(52,7) array index lookups
sw.Stop();
Console.WriteLine("time for array index lookups: {0} ms", sw.ElapsedMilliseconds);

which outputs:

time for dictionary lookups: 2532 ms
time for array index lookups: 313 ms

Is this type of behavior expected (performance decrease by a factor of 8)? IIRC, a Dictionary has, on average, O(1) lookups, while an array has worst-case O(1) lookups, so I do expect the array lookups to be faster, but not by this much!

I am currently storing poker hand rankings in a Dictionary. I suppose if this is as fast as the dictionary lookups can be, I have to rethink my approach and use arrays instead, although indexing the rankings will get a little tricky and I'll probably have to ask another question about it.


Solution

  • Don't forget that Big-O notations only says how the complexity grows with respect to the size (etc) - it doesn't give any indication of the constant factors involved. That's why sometimes even a linear search for keys is faster than a dictionary lookup, when there are sufficiently few keys. In this case you're not even doing a search with the array though - just a straight indexing operation.

    For straight index lookups, arrays are basically ideal - it's just a case of

    pointer_into_array = base_pointer + offset * size
    

    (And then a pointer dereference.)

    Performing a dictionary lookup is relatively complicated - very fast compared with (say) a linear lookup by key when there are lots of keys, but much more complicated than a straight array lookup. It has to calculate the hash of the key, then work out which bucket that should be in, possibly deal with duplicate hashes (or duplicate buckets) and then check for equality.

    As always, choose the right data structure for the job - and if you really can get away with just indexing into an array (or List<T>) then yes, that will be blindingly fast.