c# .net dictionary hashcode hash-collision

How well does .NET dictionary resolve collisions?


I have a problem with a custom object that needs to be keyed for a table. I need to generate a unique numeric key. I'm having collision problems and I'm wondering if I can leverage a dictionary to help me. Assume I have an object like this:

class Thingy
{
    public string Foo;
    public string Bar;
    public string Others;
}

and so on with more fields. Let's say Foo and Bar are my key fields: if they're equal between two Thingys, then the two objects should be considered equal (one may represent an update to the other, with the Others fields being updated). So I have these:

public override bool Equals(object obj)
{
    Thingy thing = obj as Thingy;
    if (thing == null) return false; // null or not a Thingy: never equal
    return (this.Foo == thing.Foo && this.Bar == thing.Bar);
}

public override int GetHashCode()
{
    return (this.Foo + this.Bar).GetHashCode(); // using default string impl
}

This works for the most part, but on rare occasions two Thingys that are actually different have the same hash code.

My question is this: could I use a Dictionary<Thingy, int> where I put in my Thingys, and use a sequential value coming out of the dictionary as my actual key? I'm wondering if the Dictionary, when detecting a rare hash code collision, will call my Equals method, determine that the objects are actually different, and store them separately. I imagine that, when looking a key up, it would find the bucket for that hash and search it for the correct Thingy, again using Equals for comparison.

Is this the case with dictionary, or does it only resolve collisions where the hash code is different, but (hash % size) is the same? If this won't work, what might?


Solution

  • Hash collisions only affect performance, not integrity.

    A simple test is to change GetHashCode() to simply return 1;. The dictionary still behaves properly, because it uses the hash code only to narrow the search and then calls Equals to confirm a match, but with any reasonably sized dataset it will perform terribly.
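The test described above can be sketched as follows. This is a minimal illustration, not the asker's actual code: CollidingThingy, CollisionDemo and GetOrAddId are hypothetical names introduced here, and the sequential id is assumed to be "current count plus one", which only stays unique if entries are never removed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of Thingy whose hash code is deliberately constant,
// so every key lands in the same bucket.
public sealed class CollidingThingy
{
    public string Foo;
    public string Bar;
    public string Others;

    public override bool Equals(object obj)
    {
        // Only the key fields Foo and Bar take part in equality.
        CollidingThingy other = obj as CollidingThingy;
        return other != null && Foo == other.Foo && Bar == other.Bar;
    }

    public override int GetHashCode()
    {
        return 1; // worst case: all keys share one hash code
    }
}

public static class CollisionDemo
{
    // Returns the id already stored for an equal key, or assigns the next one.
    // Assumption: entries are never removed, so Count + 1 is always unused.
    public static int GetOrAddId(Dictionary<CollidingThingy, int> ids, CollidingThingy key)
    {
        int id;
        if (!ids.TryGetValue(key, out id))
        {
            id = ids.Count + 1;
            ids.Add(key, id);
        }
        return id;
    }

    public static void Main()
    {
        var ids = new Dictionary<CollidingThingy, int>();

        // "ab" + "c" and "a" + "bc" would also collide under the concatenating hash.
        int first = GetOrAddId(ids, new CollidingThingy { Foo = "ab", Bar = "c" });
        int second = GetOrAddId(ids, new CollidingThingy { Foo = "a", Bar = "bc" });

        // Same key fields as the first object, different Others: treated as an update.
        int again = GetOrAddId(ids, new CollidingThingy { Foo = "ab", Bar = "c", Others = "updated" });

        Console.WriteLine("{0} {1} {2} count={3}", first, second, again, ids.Count);
        // prints "1 2 1 count=2"
    }
}
```

Even with every hash code identical, the two distinct keys get separate entries and the equal key resolves to its original id, because the dictionary falls back to Equals. The cost is that each lookup degrades to a linear scan of the colliding entries.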