I am trying to compare instances of a type in a List (by the values of their properties) and eliminate the duplicates. According to MSDN, GetHashCode() is one of the ways to compare two objects.
A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value.
With that in mind, I started writing my extension method as below:
public static class Linq
{
    public static IEnumerable<T> DistinctObjects<T>(this IEnumerable<T> source)
    {
        List<T> newList = new List<T>();
        foreach (var item in source)
        {
            if (newList.All(x => x.GetHashCode() != item.GetHashCode()))
                newList.Add(item);
        }
        return newList;
    }
}
This condition always gives me false, even though the data in the objects is the same:
newList.All(x => x.GetHashCode() != item.GetHashCode())
Ultimately, I would like to use it like this:
MyDuplicateList.DistinctObjects().ToList();
If comparing all fields of the object is too much, I am fine with using it like this:
MyDuplicateList.DistinctObjects(x=>x.Id, x.Name).ToList();
Here I am saying that only these two fields of the objects should be compared.
After reading your comments, I would propose this solution:
public static IEnumerable<TSource> DistinctBy<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
    HashSet<TResult> set = new HashSet<TResult>();
    foreach (var item in source)
    {
        var selectedValue = selector(item);
        if (set.Add(selectedValue))
            yield return item;
    }
}
Then you can use it like this:
var distinctedList = myList.DistinctBy(x => x.A);
or, for multiple properties, like this:
var distinctedList = myList.DistinctBy(x => new {x.A,x.B});
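For anyone who wants to try this end to end, here is a minimal sketch of a complete program. The Person class and the sample data are made up for illustration. The method is named DistinctByKey here on the assumption that you may be compiling against .NET 6 or later, where System.Linq already ships its own Enumerable.DistinctBy and an identically named extension could cause an ambiguous call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this example.
public sealed class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class DistinctExtensions
{
    // Same logic as the DistinctBy method above, under a different name.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet.Add returns false when an equal key is already present.
            if (seen.Add(selector(item)))
                yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Ann" },
            new Person { Id = 1, Name = "Ann" }, // same data, different instance
            new Person { Id = 2, Name = "Bob" },
        };

        // The anonymous type compares Id and Name by value.
        var distinct = people.DistinctByKey(p => new { p.Id, p.Name }).ToList();

        Console.WriteLine(distinct.Count); // prints 2
    }
}
```

The first occurrence of each key is the one that is kept, and because the method uses yield return, nothing is evaluated until the result is enumerated (here by ToList()).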
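The advantage of this solution is that you can specify exactly which properties are used for the distinction, and you don't have to override Equals and GetHashCode for every object. You do need to make sure that the selected values can be compared for equality: HashSet uses the key type's Equals and GetHashCode, which anonymous types, strings, and primitive types implement by value.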
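To illustrate the point about comparable keys, here is a rough sketch of a variant that also accepts an IEqualityComparer for the key. This overload is not part of the code above; its name (DistinctByKey) and the sample data are assumptions made for the example. It shows that array keys are compared by reference by default, and that a comparer such as StringComparer.OrdinalIgnoreCase changes what counts as a duplicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctComparerExtensions
{
    // Hypothetical overload: the caller decides how keys are compared.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer)
    {
        var seen = new HashSet<TKey>(comparer);
        foreach (var item in source)
        {
            if (seen.Add(selector(item)))
                yield return item;
        }
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var names = new[] { "ann", "ANN", "ann", "Bob" };

        // Default string comparison is case-sensitive: "ann", "ANN", "Bob".
        int caseSensitive = names
            .DistinctByKey(n => n, EqualityComparer<string>.Default)
            .Count();

        // Case-insensitive comparer: "ann", "Bob".
        int caseInsensitive = names
            .DistinctByKey(n => n, StringComparer.OrdinalIgnoreCase)
            .Count();

        // Arrays do not override Equals, so every char[] key is unique
        // by reference and no duplicates are removed.
        int arrayKeys = names
            .DistinctByKey(n => n.ToCharArray(), EqualityComparer<char[]>.Default)
            .Count();

        Console.WriteLine(caseSensitive);   // prints 3
        Console.WriteLine(caseInsensitive); // prints 2
        Console.WriteLine(arrayKeys);       // prints 4
    }
}
```

If a key type is one of your own classes and you cannot or do not want to override Equals and GetHashCode on it, projecting to an anonymous type of its fields, or passing a custom comparer as above, are two ways to get value-based comparison.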