Tags: c#, ienumerable, implementation, duplicates

C#: A good and efficient implementation of IEnumerable<T>.HasDuplicates


Does anyone have a good and efficient extension method for checking whether a sequence of items contains any duplicates?

I guess I could put return subjects.Distinct().Count() != subjects.Count(); into an extension method, but it feels like there should be a better way. That approach enumerates the sequence twice and has to collect all the distinct elements before it can answer. A better implementation should return true as soon as it finds the first duplicate. Any good suggestions?

I imagine the outline could be something like this:

public static bool HasDuplicates<T>(this IEnumerable<T> subjects)
{
    return subjects.HasDuplicates(EqualityComparer<T>.Default);
}

public static bool HasDuplicates<T>(this IEnumerable<T> subjects, IEqualityComparer<T> comparer)
{
    ...
}

But I'm not quite sure what a smart implementation of it would look like...


Solution

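public static bool HasDuplicates<T>(this IEnumerable<T> subjects)
{
    // Delegate to the overload below using the default comparer for T.
    return HasDuplicates(subjects, EqualityComparer<T>.Default);
}

public static bool HasDuplicates<T>(this IEnumerable<T> subjects, IEqualityComparer<T> comparer)
{
    // Tracks every item seen so far, compared with the supplied comparer.
    HashSet<T> set = new HashSet<T>(comparer);
    foreach (T item in subjects)
    {
        // Add returns false when the item is already in the set,
        // so the first failed Add is the first duplicate.
        if (!set.Add(item))
            return true;
    }

    // The whole sequence was enumerated without a repeat.
    return false;
}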
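
For anyone who wants to paste this into a project, here is a minimal self-contained sketch of the same approach with the using directives and a containing static class (extension methods need one). The class names EnumerableExtensions and Program, the Cycle helper, and the null check are assumptions added for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; extension methods must live in a static class.
public static class EnumerableExtensions
{
    public static bool HasDuplicates<T>(this IEnumerable<T> subjects)
    {
        return HasDuplicates(subjects, EqualityComparer<T>.Default);
    }

    public static bool HasDuplicates<T>(this IEnumerable<T> subjects, IEqualityComparer<T> comparer)
    {
        // Assumption: reject a null sequence up front instead of failing in foreach.
        if (subjects == null)
            throw new ArgumentNullException(nameof(subjects));

        // A null comparer is fine here: HashSet<T> then uses the default comparer.
        var seen = new HashSet<T>(comparer);
        foreach (T item in subjects)
        {
            // Add returns false if the item was already present.
            if (!seen.Add(item))
                return true;
        }
        return false;
    }
}

public static class Program
{
    // Hypothetical helper: an endless sequence 0, 1, 2, 0, 1, 2, ...
    private static IEnumerable<int> Cycle()
    {
        int i = 0;
        while (true)
            yield return i++ % 3;
    }

    public static void Main()
    {
        Console.WriteLine(new[] { 1, 2, 3, 2 }.HasDuplicates());   // True
        Console.WriteLine(new[] { "a", "A" }.HasDuplicates());     // False
        // A custom comparer changes what counts as a duplicate.
        Console.WriteLine(new[] { "a", "A" }.HasDuplicates(StringComparer.OrdinalIgnoreCase)); // True
        // Stops at the fourth item, the first repeat, even though the sequence never ends.
        Console.WriteLine(Cycle().HasDuplicates());                 // True
    }
}
```

The last call shows the early exit the question asks for: it returns at the first repeated item, whereas the Distinct().Count() comparison would never finish on an endless sequence.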