
Questions about IEqualityComparer<T> / List<T>.Distinct()


Here is the equality comparer I just wrote because I wanted a distinct set of items from a list containing entities.

    class InvoiceComparer : IEqualityComparer<Invoice>
    {
        public bool Equals(Invoice x, Invoice y)
        {
            // A
            if (Object.ReferenceEquals(x, y)) return true;

            // B
            if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null)) return false;

            // C
            return x.TxnID == y.TxnID;
        }

        public int GetHashCode(Invoice obj)
        {
            if (Object.ReferenceEquals(obj, null)) return 0;
            // Hash the same field that Equals compares, so equal invoices get equal hash codes.
            return obj.TxnID.GetHashCode();
        }
    }
  1. Why does Distinct require a comparer as opposed to a Func<T,T,bool>?
  2. Are (A) and (B) anything other than optimizations, and are there scenarios in which they would not behave as expected because of subtleties in comparing references?
  3. If I wanted to, could I replace (C) with

    return GetHashCode(x) == GetHashCode(y);


Solution

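  1. So that it can use hash codes. With a comparer that supplies GetHashCode, Distinct can run in roughly O(n) time instead of the O(n²) pairwise comparisons that a bare Func<T,T,bool> would force.
  2. (A) is an optimization.
     (B) is necessary; without it, (C) would throw a NullReferenceException whenever x or y is null. If Invoice is a struct, however, both checks are unnecessary and only make the comparison slower.
  3. No. Hash codes are not unique: two invoices with different TxnID values can share a hash code, so equal hashes do not imply equal objects.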
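
To make points 1 and 3 concrete, here is a minimal sketch rather than a definitive implementation. The Invoice class is a hypothetical stand-in, and TxnID is assumed to be a long only because two long values with the same hash code are easy to write down (the question does not show its real type). DistinctSketch and HashOnlyComparer are names invented for this example; DistinctSketch illustrates the hash-set idea and is not the actual LINQ source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity in the question.
// Assumption: TxnID is a long, chosen so a hash collision is easy to show.
class Invoice
{
    public long TxnID { get; set; }
}

// Compares invoices by TxnID, as in the question.
class InvoiceComparer : IEqualityComparer<Invoice>
{
    public bool Equals(Invoice x, Invoice y)
    {
        if (ReferenceEquals(x, y)) return true;      // (A) same instance, or both null
        if (x is null || y is null) return false;    // (B) exactly one side is null
        return x.TxnID == y.TxnID;                   // (C) the real comparison
    }

    public int GetHashCode(Invoice obj) => obj is null ? 0 : obj.TxnID.GetHashCode();
}

// The variant proposed in question 3: equal hash codes are treated as equal objects.
class HashOnlyComparer : IEqualityComparer<Invoice>
{
    public bool Equals(Invoice x, Invoice y) => GetHashCode(x) == GetHashCode(y);

    public int GetHashCode(Invoice obj) => obj is null ? 0 : obj.TxnID.GetHashCode();
}

static class Demo
{
    // Sketch of the idea behind Distinct: a hash set buckets items by
    // GetHashCode and calls Equals only on items that land in the same bucket.
    public static IEnumerable<T> DistinctSketch<T>(IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        var seen = new HashSet<T>(comparer);
        foreach (var item in source)
        {
            if (seen.Add(item)) yield return item;   // Add returns false for duplicates
        }
    }

    public static List<Invoice> Sample() => new List<Invoice>
    {
        new Invoice { TxnID = 0 },
        new Invoice { TxnID = 4294967297 },  // (1L << 32) | 1: different ID, same hash code as 0
        new Invoice { TxnID = 0 },           // genuine duplicate of the first item
    };

    public static void Main()
    {
        var invoices = Sample();

        Console.WriteLine(invoices.Distinct(new InvoiceComparer()).Count());        // 2
        Console.WriteLine(DistinctSketch(invoices, new InvoiceComparer()).Count()); // 2
        Console.WriteLine(invoices.Distinct(new HashOnlyComparer()).Count());       // 1
    }
}
```

With InvoiceComparer, only the genuine duplicate is dropped. With HashOnlyComparer, the invoice whose TxnID is 4294967297 is also discarded, because its hash code happens to equal that of 0 even though the IDs differ.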