IEqualityComparer<T> in the namespace System.Collections.Generic has the following methods:
bool Equals(T x, T y);
int GetHashCode(T obj);
Since this interface is used to check equality of objects, the first method, Equals, makes sense. But why do we need to implement GetHashCode as well? Why does it exist in the interface in the first place? When is it needed and why?
I'm using it with the Enumerable.Distinct() method in the namespace System.Linq, and I'm surprised to see that even GetHashCode() is getting called, along with Equals(). Why? How does Distinct work?
For details on how Distinct works (or at least a simple example implementation) see my Edulinq blog post on it (old - 404).
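Since that link is dead, here is a rough sketch of the idea, assuming Distinct is built on a hash-based set. It is not the framework's actual source, and the names DistinctSketch and MyDistinct are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctSketch
{
    // Hypothetical helper for illustration; not the real Enumerable.Distinct.
    public static IEnumerable<T> MyDistinct<T>(
        IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        // HashSet<T> uses comparer.GetHashCode to choose a bucket, and
        // comparer.Equals only to confirm a match among candidates
        // that share the same hash code.
        var seen = new HashSet<T>(comparer);
        foreach (T item in source)
        {
            // Add returns false when an equal item is already in the set,
            // so each distinct item is yielded once, on first sight.
            if (seen.Add(item))
            {
                yield return item;
            }
        }
    }
}
```

Calling DistinctSketch.MyDistinct(new[] { "a", "A", "b" }, StringComparer.OrdinalIgnoreCase) yields two items, because "a" and "A" hash to the same value and then compare equal.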
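To put it simply, a hash code which corresponds to the appropriate equality comparison makes it cheaper to create a set of items. A set can use the hash code to jump straight to the few candidates that might be equal, rather than calling Equals against every item it has already seen. That's useful in a lot of situations, such as Distinct, Except, Intersect, Union, Join, GroupJoin, GroupBy, ToLookup and so on.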
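To see this from the question's side, here is a small sketch of a comparer that counts how often each method is called. CountingComparer and CountingDemo are hypothetical names, and the exact counts may vary between framework versions, so treat it as a way to observe the behaviour rather than a specification of it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer for illustration: case-insensitive string
// equality that counts how often each method is called.
public sealed class CountingComparer : IEqualityComparer<string>
{
    public int EqualsCalls { get; private set; }
    public int GetHashCodeCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return StringComparer.OrdinalIgnoreCase.Equals(x, y);
    }

    // Must agree with Equals: any two strings that compare equal
    // have to return the same hash code.
    public int GetHashCode(string obj)
    {
        GetHashCodeCalls++;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj);
    }
}

public static class CountingDemo
{
    // Runs Distinct with the given comparer so the caller can
    // inspect the counters afterwards.
    public static List<string> Run(CountingComparer comparer)
    {
        var words = new[] { "apple", "Apple", "pear" };
        return words.Distinct(comparer).ToList();
    }
}
```

After CountingDemo.Run(comparer), the result has two items and comparer.GetHashCodeCalls is non-zero, which is the behaviour the question noticed. If GetHashCode returned different values for "apple" and "Apple", Distinct could treat them as different items even though Equals says they match.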