Can you create a simple 'EqualityComparer<T>' using a lambda expression?


Short question:

Is there a simple way in LINQ to Objects to get a distinct list of objects from a list, based on a key property of the objects?

Long question:

I am trying to do a Distinct() operation on a list of objects that have a key as one of their properties.

class GalleryImage {
   public int Key { get; set; }
   public string Caption { get; set; }
   public string Filename { get; set; }
   public string[] Tags { get; set; }
}

I have a list of Gallery objects that contain GalleryImage[].

Because of the way the web service works, I have duplicates of the GalleryImage object. I thought it would be a simple matter to use Distinct() to get a distinct list.

This is the LINQ query I want to use:

var allImages = Galleries.SelectMany(x => x.Images);
var distinctImages = allImages.Distinct<GalleryImage>(new 
                     EqualityComparer<GalleryImage>((a, b) => a.Key == b.Key));

The problem is that EqualityComparer<T> is an abstract class, so it cannot be instantiated like this.

I don't want to:

  • implement IEquatable on GalleryImage because it is generated
  • have to write a separate class to implement IEqualityComparer as shown here

Is there a concrete implementation of EqualityComparer somewhere that I'm missing?

I would have thought there would be an easy way to get 'distinct' objects from a set based on a key.


Solution

  • (There are two solutions here - see the end for the second one):

    My MiscUtil library has a ProjectionEqualityComparer class (and two supporting classes to make use of type inference).

    Here's an example of using it:

    IEqualityComparer<GalleryImage> comparer = 
        ProjectionEqualityComparer<GalleryImage>.Create(x => x.Key);
    

    Here's the code (comments removed):

    // Helper class for construction
    public static class ProjectionEqualityComparer
    {
        public static ProjectionEqualityComparer<TSource, TKey>
            Create<TSource, TKey>(Func<TSource, TKey> projection)
        {
            return new ProjectionEqualityComparer<TSource, TKey>(projection);
        }
    
        public static ProjectionEqualityComparer<TSource, TKey>
            Create<TSource, TKey> (TSource ignored,
                                   Func<TSource, TKey> projection)
        {
            return new ProjectionEqualityComparer<TSource, TKey>(projection);
        }
    }
    
    public static class ProjectionEqualityComparer<TSource>
    {
        public static ProjectionEqualityComparer<TSource, TKey>
            Create<TKey>(Func<TSource, TKey> projection)
        {
            return new ProjectionEqualityComparer<TSource, TKey>(projection);
        }
    }
    
    public class ProjectionEqualityComparer<TSource, TKey>
        : IEqualityComparer<TSource>
    {
        readonly Func<TSource, TKey> projection;
        readonly IEqualityComparer<TKey> comparer;
    
        public ProjectionEqualityComparer(Func<TSource, TKey> projection)
            : this(projection, null)
        {
        }
    
        public ProjectionEqualityComparer(
            Func<TSource, TKey> projection,
            IEqualityComparer<TKey> comparer)
        {
            projection.ThrowIfNull("projection");
            this.comparer = comparer ?? EqualityComparer<TKey>.Default;
            this.projection = projection;
        }
    
        public bool Equals(TSource x, TSource y)
        {
            if (x == null && y == null)
            {
                return true;
            }
            if (x == null || y == null)
            {
                return false;
            }
            return comparer.Equals(projection(x), projection(y));
        }
    
        public int GetHashCode(TSource obj)
        {
            if (obj == null)
            {
                throw new ArgumentNullException("obj");
            }
            return comparer.GetHashCode(projection(obj));
        }
    }
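
Neither snippet above shows the comparer actually being handed to Distinct, so here is a minimal, self-contained sketch of that last step. KeyComparer is a hypothetical, trimmed-down stand-in for ProjectionEqualityComparer<TSource, TKey> (same idea, no null handling), and the GalleryImage here keeps only two properties; it assumes Key is the identifying property, as in the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down version of the question's generated class
// (assumption: Key is what identifies an image).
public class GalleryImage
{
    public int Key { get; set; }
    public string Caption { get; set; }
}

// Hypothetical stand-in for ProjectionEqualityComparer<TSource, TKey>:
// two items are equal when their projected keys are equal.
public sealed class KeyComparer<TSource, TKey> : IEqualityComparer<TSource>
{
    private readonly Func<TSource, TKey> projection;

    public KeyComparer(Func<TSource, TKey> projection)
    {
        this.projection = projection ?? throw new ArgumentNullException(nameof(projection));
    }

    public bool Equals(TSource x, TSource y) =>
        EqualityComparer<TKey>.Default.Equals(projection(x), projection(y));

    // Hashes the same projected key that Equals compares, so items that
    // compare equal always produce the same hash code.
    public int GetHashCode(TSource obj) =>
        EqualityComparer<TKey>.Default.GetHashCode(projection(obj));
}

public static class DistinctDemo
{
    public static List<GalleryImage> Run()
    {
        var allImages = new[]
        {
            new GalleryImage { Key = 1, Caption = "Sunset" },
            new GalleryImage { Key = 2, Caption = "Harbour" },
            new GalleryImage { Key = 1, Caption = "Sunset (duplicate)" },
        };

        IEqualityComparer<GalleryImage> comparer =
            new KeyComparer<GalleryImage, int>(image => image.Key);

        // Only one image per Key survives.
        return allImages.Distinct(comparer).ToList();
    }
}
```

DistinctDemo.Run() returns two images, one per key. The GetHashCode half is the reason a bare (a, b) => a.Key == b.Key lambda is not enough on its own: Distinct relies on hash codes as well as Equals, so the comparer needs a hash function that agrees with the equality test.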
    

    Second solution

    To do this just for Distinct, you can use the DistinctBy extension in MoreLINQ:

        public static IEnumerable<TSource> DistinctBy<TSource, TKey>
            (this IEnumerable<TSource> source,
             Func<TSource, TKey> keySelector)
        {
            return source.DistinctBy(keySelector, null);
        }
    
        public static IEnumerable<TSource> DistinctBy<TSource, TKey>
            (this IEnumerable<TSource> source,
             Func<TSource, TKey> keySelector,
             IEqualityComparer<TKey> comparer)
        {
            source.ThrowIfNull("source");
            keySelector.ThrowIfNull("keySelector");
            return DistinctByImpl(source, keySelector, comparer);
        }
    
        private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>
            (IEnumerable<TSource> source,
             Func<TSource, TKey> keySelector,
             IEqualityComparer<TKey> comparer)
        {
            HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
            foreach (TSource element in source)
            {
                if (knownKeys.Add(keySelector(element)))
                {
                    yield return element;
                }
            }
        }
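
For a sense of how DistinctBy reads at the call site, here is a small sketch. It assumes .NET 6 or later, where System.Linq.Enumerable ships its own DistinctBy with the same (source, keySelector) shape; on older frameworks the same call should resolve to the MoreLINQ extension above once its namespace is imported. The Image class and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByDemo
{
    // Hypothetical sample type; only Key matters for de-duplication.
    public sealed class Image
    {
        public int Key { get; set; }
        public string Filename { get; set; }
    }

    public static List<Image> Run()
    {
        // Two galleries that share the image with Key = 2.
        var galleries = new[]
        {
            new[]
            {
                new Image { Key = 1, Filename = "a.jpg" },
                new Image { Key = 2, Filename = "b.jpg" },
            },
            new[]
            {
                new Image { Key = 2, Filename = "b.jpg" },
                new Image { Key = 3, Filename = "c.jpg" },
            },
        };

        // Flatten the galleries, then keep one image per Key.
        return galleries
            .SelectMany(images => images)
            .DistinctBy(image => image.Key)
            .ToList();
    }
}
```

No comparer class is needed here because the key (an int) already has a sensible default equality. The overload that also takes an IEqualityComparer<TKey> covers keys that do not, for example StringComparer.OrdinalIgnoreCase for case-insensitive string keys.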
    

    In both cases, ThrowIfNull looks like this:

    public static void ThrowIfNull<T>(this T data, string name) where T : class
    {
        if (data == null)
        {
            throw new ArgumentNullException(name);
        }
    }
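
One detail worth spelling out is why DistinctBy validates its arguments in a separate method from the one containing yield return. An iterator method does not run any of its body until the sequence is enumerated, so a null check placed inside it would fire late. The sketch below contrasts the two shapes using made-up method names (LazyCheck, EagerCheck, ThrowsOnCall); ThrowIfNull is wrapped in a hypothetical static class called Guard because extension methods must be declared in a static class.

```csharp
using System;
using System.Collections.Generic;

public static class Guard
{
    // Same helper as above, placed in a static class so that it
    // compiles as an extension method.
    public static void ThrowIfNull<T>(this T data, string name) where T : class
    {
        if (data == null)
        {
            throw new ArgumentNullException(name);
        }
    }
}

public static class ValidationDemo
{
    // Iterator method: the null check is deferred until the first MoveNext().
    public static IEnumerable<int> LazyCheck(IEnumerable<int> source)
    {
        source.ThrowIfNull("source");
        foreach (int item in source)
        {
            yield return item;
        }
    }

    // Non-iterator wrapper: the null check runs as soon as the method is called.
    public static IEnumerable<int> EagerCheck(IEnumerable<int> source)
    {
        source.ThrowIfNull("source");
        return EagerCheckImpl(source);
    }

    private static IEnumerable<int> EagerCheckImpl(IEnumerable<int> source)
    {
        foreach (int item in source)
        {
            yield return item;
        }
    }

    // Returns true if invoking 'call' with null throws ArgumentNullException
    // immediately, without enumerating the result.
    public static bool ThrowsOnCall(Func<IEnumerable<int>, IEnumerable<int>> call)
    {
        try
        {
            call(null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

ValidationDemo.ThrowsOnCall(ValidationDemo.EagerCheck) is true, whereas the same call with LazyCheck is false: in the lazy version the exception only surfaces once something enumerates the result.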