Tags: c#, distinct, equality, iequalitycomparer

IEqualityComparer fails using two properties


When I use this comparer in Distinct(), Equals always seems to return false. I can't see a reason why.

public class IdEqualityComparer : IEqualityComparer<Relationship>
{
    public bool Equals(Relationship x, Relationship y)
    {
        if (x == null && y == null)
            return true;
        else if (x == null || y == null)
            return false;
        else if (x.ID == y.ID && x.RelatedID == y.RelatedID)
            return true;
        else
            return false;
    }

    public int GetHashCode(Relationship obj)
    {
        unchecked
        {
            int hash = (obj.ID ?? "").GetHashCode() ^ (obj.RelatedID ?? "").GetHashCode();
            return hash;
        }
    }
}

The hash looks correct to me, but the comparison of ID and RelatedID never returns true.

I know it fails because when I inspect the result afterwards, it still contains items that share the same ID and RelatedID.


Solution

  • Seems to work fine here:

    static void Main()
    {
        var objs = new[]
        {
            new Relationship { ID = "a", RelatedID = "b" }, // <----\
            new Relationship { ID = "a", RelatedID = "c" }, //      |
            new Relationship { ID = "a", RelatedID = "b" }, // dup--/
            new Relationship { ID = "d", RelatedID = "b" }, // <------\
            new Relationship { ID = "d", RelatedID = "c" }, //        |
            new Relationship { ID = "d", RelatedID = "b" }, // dup ---/ 
            new Relationship { ID = "b", RelatedID = "c" }, //
        };
    
        var count = objs.Distinct(new IdEqualityComparer()).Count();
        System.Console.WriteLine(count);
    }
    

    This gives 5, not 7 (which is what we would expect if Equals always returned false). Tested with:

    public class Relationship
    {
        public string ID { get; set; }
        public string RelatedID { get; set; }
    }
    

    To illustrate this more clearly:

    var a = new Relationship { ID = "x", RelatedID = "foo" };
    var b = new Relationship { ID = "y", RelatedID = "foo" };
    var c = new Relationship { ID = "x", RelatedID = "foo" };
    

    we can now demonstrate that the comparer returns true and false appropriately:

    var comparer = new IdEqualityComparer();
    Console.WriteLine(comparer.Equals(a, b)); // False
    Console.WriteLine(comparer.Equals(a, c)); // True
    

    we can also show that the hash-code is working appropriately:

    Console.WriteLine(comparer.GetHashCode(a));
    Console.WriteLine(comparer.GetHashCode(b));
    Console.WriteLine(comparer.GetHashCode(c));
    

    note that the exact numbers can differ between runs and runtimes (string hash codes are not guaranteed to be stable), but for me on this run this gives:

    -789327704
    1132350395
    -789327704
    

    The numbers don't matter - what matters is that the first and last are equal, and (ideally) different from the middle one.
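
    To see why matching hash codes matter, here is a minimal sketch with a deliberately broken comparer. `BrokenHashComparer` and `HashDemo` are hypothetical names, not part of the question. `Distinct()` only calls `Equals` on items whose hash codes match, so a `GetHashCode` that returns different values for equal objects means the duplicates are never even compared:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Relationship
    {
        public string ID { get; set; }
        public string RelatedID { get; set; }
    }

    // Hypothetical comparer, broken on purpose: Equals is correct, but
    // GetHashCode hands out a different number on every call.
    public class BrokenHashComparer : IEqualityComparer<Relationship>
    {
        private int _next;

        // Counts how often Distinct() actually asks for an equality check.
        public int EqualsCalls { get; private set; }

        public bool Equals(Relationship x, Relationship y)
        {
            EqualsCalls++;
            if (x == null || y == null) return x == null && y == null;
            return x.ID == y.ID && x.RelatedID == y.RelatedID;
        }

        // Never returns the same value twice, so equal objects never share a hash.
        public int GetHashCode(Relationship obj) => ++_next;
    }

    public static class HashDemo
    {
        public static void Main()
        {
            var items = new[]
            {
                new Relationship { ID = "x", RelatedID = "foo" },
                new Relationship { ID = "x", RelatedID = "foo" },
            };

            var comparer = new BrokenHashComparer();
            Console.WriteLine(items.Distinct(comparer).Count()); // items kept
            Console.WriteLine(comparer.EqualsCalls);             // Equals invocations
        }
    }
    ```

    This should print 2 and then 0: both items survive, and Equals is never called. The comparer in the question does not have this problem, since its hash depends only on ID and RelatedID.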

    So: the comparer is working fine, and the premise of the question is incorrect. You need to identify in your code what is different, and fix it.
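
    As a starting point for that search, the sketch below looks for items that only appear to be duplicates. It assumes, without any evidence from the question, that the real data might contain IDs that differ in whitespace or casing; `NearDuplicateFinder`, `NearDuplicateDemo`, and the sample values are made up for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Relationship
    {
        public string ID { get; set; }
        public string RelatedID { get; set; }
    }

    // Hypothetical helper: finds groups of items whose IDs match after
    // trimming and upper-casing, but whose raw values are not identical.
    public static class NearDuplicateFinder
    {
        public static List<List<Relationship>> Find(IEnumerable<Relationship> items)
        {
            return items
                .GroupBy(r => (Normalize(r.ID), Normalize(r.RelatedID)))
                // Keep only groups that contain more than one distinct raw pair.
                .Where(g => g.Select(r => (r.ID, r.RelatedID)).Distinct().Count() > 1)
                .Select(g => g.ToList())
                .ToList();
        }

        private static string Normalize(string s) => (s ?? "").Trim().ToUpperInvariant();
    }

    public static class NearDuplicateDemo
    {
        public static void Main()
        {
            // Made-up data: the second item has a trailing space in ID.
            var items = new[]
            {
                new Relationship { ID = "a",  RelatedID = "b" },
                new Relationship { ID = "a ", RelatedID = "b" },
                new Relationship { ID = "d",  RelatedID = "c" },
                new Relationship { ID = "d",  RelatedID = "c" }, // exact duplicate, not reported
            };

            foreach (var group in NearDuplicateFinder.Find(items))
                foreach (var r in group)
                    Console.WriteLine($"[{r.ID}] [{r.RelatedID}]"); // brackets expose stray whitespace
        }
    }
    ```

    If this reports nothing on the real data, other things worth checking are whether the result of Distinct() is actually the sequence being inspected (it returns a new sequence and leaves the source untouched) and whether the comparer instance is really being passed in.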