c# string dictionary gethashcode

What key should I use for a dictionary keyed by interned strings?


I wish to associate an in-memory cached data structure with each of a set of interned strings, and to use a passed instance of an interned string to look up its associated data structure.

The predefined set will contain around 1000 strings. Cache population costs can be ignored, but I want high-performance lookups.

public class InternedExtras
{
  public DateTime Prop1 { get; set; }
  public decimal Prop2 { get; set; }
}

Ideally I would create a Dictionary keyed on an interned string's reference, but .NET does not expose object references as a specific type.

If I declare my Dictionary as:

Dictionary<string, InternedExtras>

then I am concerned that the System.String equality override will perform a char-by-char value comparison during dictionary lookup, which will be inefficient.

An option would be:

Dictionary<int, InternedExtras> _extrasDictionary;

InternedExtras GetInternedExtras(string knownToBeInterned)
{
  // Keyed by hash code alone: two different strings with equal hash codes would share an entry.
  return _extrasDictionary[knownToBeInterned.GetHashCode()];
}

However, I have never fully understood hash code maths, and I understand that uniqueness is not guaranteed.

The average length of my interned strings is 50 chars, and I can deploy to the latest .NET version.
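
To make the uniqueness concern concrete, here is a minimal sketch (not part of the original post) that brute-forces two different strings sharing the same GetHashCode() value. The class name HashCollisionDemo, the "key-" prefix, and the attempt limit are hypothetical choices for illustration. Because a hash code is only 32 bits wide, a collision is overwhelmingly likely well before a million distinct strings have been tried, but which pair collides can differ from run to run.

```csharp
using System;
using System.Collections.Generic;

public static class HashCollisionDemo
{
    // Searches for two distinct strings whose GetHashCode() values are equal.
    // maxAttempts is an illustrative limit, not a tuned value.
    public static (string First, string Second) FindCollision(int maxAttempts = 1_000_000)
    {
        // Maps each hash code seen so far to the string that produced it.
        var seen = new Dictionary<int, string>();

        for (int i = 0; i < maxAttempts; i++)
        {
            string candidate = "key-" + i;
            int hash = candidate.GetHashCode();

            if (seen.TryGetValue(hash, out var earlier))
            {
                // Same hash code, different string: a Dictionary<int, ...>
                // keyed by GetHashCode() would treat these as one key.
                return (earlier, candidate);
            }

            seen[hash] = candidate;
        }

        throw new InvalidOperationException("No collision found; raise maxAttempts.");
    }
}
```

Calling HashCollisionDemo.FindCollision() and printing both strings alongside their hash codes shows why the Dictionary<int, InternedExtras> option could silently return the wrong entry.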


Solution

  • I actually think this is your most efficient option:

    Dictionary<string, InternedExtras> _extrasDictionary;
    

    Doing a lookup as follows is actually very efficient!

    InternedExtras extras = _extrasDictionary[interned];
    

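    The char-by-char comparison that you refer to will only be performed against a small subset of the keys. This is because interned.GetHashCode() is used to group the keys into "buckets", so only keys that land in the matching bucket are compared with the string you pass in. In addition, when the string you pass is the same interned instance as the stored key, String.Equals returns as soon as it sees that the two references match, without comparing any characters.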

    This question has much more detail on the subject:

    How does a hash table work?
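
As a rough sketch of the two approaches side by side: the first dictionary below is the plain Dictionary<string, InternedExtras> recommended above, and the second is an alternative that is not from this answer, keyed purely by reference via ReferenceEqualityComparer.Instance (available in .NET 5 and later). The ExtrasCache class and its method names are hypothetical. The reference-based variant never reads the characters of the key, but it only finds entries when the caller passes the exact interned instance, and whether it is measurably faster for 1000 keys of about 50 chars is something to benchmark rather than assume.

```csharp
using System;
using System.Collections.Generic;

public class InternedExtras
{
    public DateTime Prop1 { get; set; }
    public decimal Prop2 { get; set; }
}

public static class ExtrasCache
{
    // Option A: default string comparer (value equality, as in the answer).
    private static readonly Dictionary<string, InternedExtras> ByValue =
        new Dictionary<string, InternedExtras>();

    // Option B (assumption, not from the answer): keys are hashed and
    // compared by object reference only. Requires .NET 5 or later.
    private static readonly Dictionary<string, InternedExtras> ByReference =
        new Dictionary<string, InternedExtras>(ReferenceEqualityComparer.Instance);

    public static void Add(string key, InternedExtras extras)
    {
        // Store the interned instance so reference lookups can find it.
        string interned = string.Intern(key);
        ByValue[interned] = extras;
        ByReference[interned] = extras;
    }

    // Works for any string with the same characters, interned or not.
    public static bool TryGetByValue(string key, out InternedExtras extras) =>
        ByValue.TryGetValue(key, out extras);

    // Works only when the caller passes the interned instance itself.
    public static bool TryGetByReference(string knownToBeInterned, out InternedExtras extras) =>
        ByReference.TryGetValue(knownToBeInterned, out extras);
}
```

Note the trade-off in Option B: a string with identical characters that has not been interned will miss, so callers must pass string.Intern(value) or a string already known to be interned.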