I have a very large hashset (hundreds of thousands) of Customer objects in my database. I then receive a newly imported hashset of Customer objects and have to check, for every new object, whether it is contained in the existing hashset. Performance is very important.
I cannot use the default EqualityComparer because customers must be compared based on only three properties. I also can't override the Equals and GetHashCode functions of the Customer class, for other reasons. So I went for a custom EqualityComparer (I tried both implementing IEqualityComparer and inheriting from EqualityComparer and overriding, as you see below, with the same end result).
public class CustomerComparer : EqualityComparer<Customer>
{
    public CustomerComparer() { }

    // Two customers are equal when all three properties match.
    public override bool Equals(Customer x, Customer y)
    {
        return x != null &&
               y != null &&
               x.Name == y.Name &&
               x.Description == y.Description &&
               x.AdditionalInfo == y.AdditionalInfo;
    }

    // Hash over the same three properties that Equals compares.
    public override int GetHashCode(Customer obj)
    {
        var hashCode = -1885141022;
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.Name);
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.Description);
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.AdditionalInfo);
        return hashCode;
    }
}
Now to my problem: When I use the default EqualityComparer, generally only the GetHashCode method of Customer is called and the performance for my use case is very good (1-2 seconds). When I use my custom EqualityComparer, its GetHashCode method is never called; only its Equals method is called, over and over. The performance for my use case is horrible (hours). See code below:
public void FilterImportedCustomers(ISet<Customer> dataBase, IEnumerable<Customer> imported)
{
    var equalityComparer = new CustomerComparer();
    foreach (var obj in imported)
    {
        // great performance, always calls Customer.GetHashCode
        if (!dataBase.Contains(obj))
        {
            //...
        }
        // awful performance, only calls CustomerComparer.Equals
        // (this two-argument overload is the LINQ extension Enumerable.Contains,
        // which enumerates the set instead of doing a hash lookup)
        if (!dataBase.Contains(obj, equalityComparer))
        {
            //...
        }
    }
}
Does anyone have an idea how I can solve this problem? That would be amazing, I'm really stuck trying to solve this huge performance problem.
EDIT:
I solved it by passing my EqualityComparer when initializing the hashset! I used the constructor overload that takes an IEqualityComparer, so var database = new HashSet<Customer>(new CustomerComparer())
Thank you, guys!
I solved it by passing my EqualityComparer when initializing the hashset! I used the constructor overload that takes an IEqualityComparer, so var database = new HashSet<Customer>(new CustomerComparer())
Thanks to Lee and NetMage who commented under my original post.
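For anyone who wants to see the difference, here is a minimal sketch of the fix. It assumes a simplified Customer class with only the three string properties, and it uses a hypothetical CountingCustomerComparer (not my real comparer) that counts how often each method is called. The set is built with the comparer in its constructor, so the one-argument HashSet<T>.Contains hashes with that comparer, while the two-argument call resolves to the LINQ extension Enumerable.Contains and walks the elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the real Customer class (assumption).
public class Customer
{
    public string Name { get; set; }
    public string Description { get; set; }
    public string AdditionalInfo { get; set; }
}

// Hypothetical comparer for illustration: same equality rule as above,
// plus counters so the number of calls can be inspected.
public class CountingCustomerComparer : IEqualityComparer<Customer>
{
    public int EqualsCalls;
    public int HashCalls;

    public bool Equals(Customer x, Customer y)
    {
        EqualsCalls++;
        return x != null && y != null &&
               x.Name == y.Name &&
               x.Description == y.Description &&
               x.AdditionalInfo == y.AdditionalInfo;
    }

    public int GetHashCode(Customer obj)
    {
        HashCalls++;
        // Tuple hashing is used here only to keep the sketch short.
        return (obj.Name, obj.Description, obj.AdditionalInfo).GetHashCode();
    }
}

public static class LookupDemo
{
    public static (bool FoundViaSet, int SetHashCalls, bool FoundViaLinq, int LinqEqualsCalls) Run(int size)
    {
        var comparer = new CountingCustomerComparer();
        var existing = Enumerable.Range(0, size)
            .Select(i => new Customer { Name = "N" + i, Description = "D", AdditionalInfo = "A" });

        // The comparer is handed to the constructor, so the set buckets by it.
        var dataBase = new HashSet<Customer>(existing, comparer);
        var probe = new Customer { Name = "N" + (size - 1), Description = "D", AdditionalInfo = "A" };

        // Reset the counters after construction, then do a hash-based lookup.
        comparer.HashCalls = 0;
        comparer.EqualsCalls = 0;
        bool foundViaSet = dataBase.Contains(probe);            // HashSet<T>.Contains
        int setHashCalls = comparer.HashCalls;

        // Reset again, then call the LINQ overload that takes a comparer.
        comparer.HashCalls = 0;
        comparer.EqualsCalls = 0;
        bool foundViaLinq = dataBase.Contains(probe, comparer); // Enumerable.Contains
        int linqEqualsCalls = comparer.EqualsCalls;

        return (foundViaSet, setHashCalls, foundViaLinq, linqEqualsCalls);
    }

    public static void Main()
    {
        var r = Run(100_000);
        Console.WriteLine($"set lookup:  found={r.FoundViaSet}, GetHashCode calls={r.SetHashCalls}");
        Console.WriteLine($"LINQ lookup: found={r.FoundViaLinq}, Equals calls={r.LinqEqualsCalls}");
    }
}
```

In my FilterImportedCustomers method this means building dataBase with the comparer once and then calling only dataBase.Contains(obj) inside the loop. The exact call counts may differ between runtime versions, so treat the printed numbers as an indication rather than a guarantee.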