c# linq iequalitycomparer

How does LINQ Except compare results?


How does Except determine whether two values are the same?

I have the following code

var removes = collection.Except(values, comparer).ToList();
var adds = values.Except(collection, comparer).ToList();
foreach (var item in removes)
{
    collection.Remove(item);
}
foreach (var item in adds)
{
    collection.Add(item);
}

However, items that the comparer says are equal still end up in the Except results. To see what was happening, I put a breakpoint in the Equals function, and it is never hit; only the GetHashCode() function is called.

So what criteria are being used to compare the items? Is it only if the hashes are different that it calls the equality function?

Edit: the comparer class and the compared class are:

public class Lookup
{
    public static readonly IEqualityComparer<Lookup> DefaultComparer = new EqualityComparer();
    private class EqualityComparer : IEqualityComparer<Lookup>
    {
        public bool Equals(Lookup x, Lookup y)
        {
            if (x == null)
                return y == null;
            else if (y == null)
                return false;
            else
                return x.ID == y.ID
                    && x.Category == y.Category
                    && x.DisplayText == y.DisplayText
                    && MetaData.CollectionComparer.Equals(x.MetaData, y.MetaData);
        }

        public int GetHashCode(Lookup obj)
        {
            var rtn = new { obj.ID, obj.Category, obj.DisplayText, obj.MetaData }.GetHashCode();

            return rtn;
        }
    }
    [DataMember]
    public int ID { get; set; }
    [DataMember]
    public LookupType Category { get; set; }
    [DataMember]
    public string DisplayText { get; set; }
    [DataMember]
    public MetaData[] MetaData { get; set; }
}

Solution

  • is it only if the hashes are different that it calls the equality function?

    Almost, but it is the other way around: Equals is called only when the hash codes are the same. This is done for performance reasons, on the assumption that a GetHashCode implementation is much cheaper than an Equals implementation.

    If two objects have different hash codes, they cannot be equal (given a correct GetHashCode), so there is no need to call Equals. Only when the hash codes match is Equals called, to check whether the objects are really equal or merely share a hash code by coincidence.

    So your GetHashCode implementation should always ensure that equal objects have the same hash code.

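    Your GetHashCode builds an anonymous type from ID, Category, DisplayText and MetaData and returns its hash code. An anonymous type derives its hash code from the hash codes of its properties, so properties such as ID and DisplayText hash by value. MetaData, however, is an array, and an array's hash code is based on the reference, not on its contents. Two Lookup objects with equal contents but separate MetaData arrays therefore get different hash codes, Equals is never called, and Except treats them as different.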
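
The behaviour described above can be reproduced with a small sketch. Item, BrokenComparer and FixedComparer are hypothetical names, not part of the original code; Item is a cut-down stand-in for Lookup, and its Tags array plays the role of MetaData. The fix shown here (leaving the array out of the hash) is only one option, assuming the remaining properties distinguish items well enough; hashing the array's contents would work too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the Lookup class in the question.
public class Item
{
    public int ID { get; set; }
    public string DisplayText { get; set; }
    public string[] Tags { get; set; }   // plays the role of MetaData[]
}

// Same pattern as the question: the array itself is part of the hash.
public class BrokenComparer : IEqualityComparer<Item>
{
    public int EqualsCalls;   // counts how often Except falls back to Equals

    public bool Equals(Item x, Item y)
    {
        EqualsCalls++;
        if (x == null || y == null) return x == null && y == null;
        // Sketch only: assumes Tags is never null.
        return x.ID == y.ID
            && x.DisplayText == y.DisplayText
            && x.Tags.SequenceEqual(y.Tags);
    }

    // An array's hash code is reference-based, so two items with equal
    // contents but separate arrays normally hash differently.
    public virtual int GetHashCode(Item obj)
    {
        return new { obj.ID, obj.DisplayText, obj.Tags }.GetHashCode();
    }
}

// Hashes only the value-like properties. Equal items now always share a
// hash code, and Equals still compares the array contents.
public class FixedComparer : BrokenComparer
{
    public override int GetHashCode(Item obj)
    {
        return new { obj.ID, obj.DisplayText }.GetHashCode();
    }
}

public static class Demo
{
    public static void Main()
    {
        var a = new[] { new Item { ID = 1, DisplayText = "One", Tags = new[] { "x" } } };
        var b = new[] { new Item { ID = 1, DisplayText = "One", Tags = new[] { "x" } } };

        var broken = new BrokenComparer();
        int leftOverBroken = a.Except(b, broken).Count();
        // Barring an accidental hash collision, the item is reported as
        // different and Equals was never consulted.
        Console.WriteLine($"broken: {leftOverBroken} left, Equals calls: {broken.EqualsCalls}");

        var fixedComparer = new FixedComparer();
        int leftOverFixed = a.Except(b, fixedComparer).Count();
        // Hash codes match, so Equals is called and the item is removed.
        Console.WriteLine($"fixed: {leftOverFixed} left, Equals calls: {fixedComparer.EqualsCalls}");
    }
}
```

With the fixed comparer, Except returns an empty sequence and Equals is called at least once. With the broken one, the item normally stays in the result and the call counter stays at zero, which matches the breakpoint never being hit in the question.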