Search code examples
c#linqoptimizationenumerablereference-source

LINQ Enumerable strategy comparison: Any, Count, SingleOrDefault


We've had a little introspection over at Code Review of LINQ Enumerable and discovered different strategies for iterating a IEnumerable<TSource> to yield us some information about it.

Any

public static bool Any<TSource>(
     this IEnumerable<TSource> source, Func<TSource, bool> predicate) 
{
     if (source == null) throw Error.ArgumentNull("source");
     if (predicate == null) throw Error.ArgumentNull("predicate");
     foreach (TSource element in source) {
         if (predicate(element)) return true;
     }
     return false;
 }

 public static bool Any<TSource>(this IEnumerable<TSource> source)
 {
      if (source == null) throw Error.ArgumentNull("source");
      using (IEnumerator<TSource> e = source.GetEnumerator()) {
          if (e.MoveNext()) return true;
      }
      return false;
  }

Count

public static int Count<TSource>(this IEnumerable<TSource> source) 
{
    if (source == null) throw Error.ArgumentNull("source");
    ICollection<TSource> collectionoft = source as ICollection<TSource>;
    if (collectionoft != null) return collectionoft.Count;
    ICollection collection = source as ICollection;
    if (collection != null) return collection.Count;
    int count = 0;
    using (IEnumerator<TSource> e = source.GetEnumerator()) {
        checked {
            while (e.MoveNext()) count++;
        }
    }
    return count;
}

SingleOrDefault

public static TSource SingleOrDefault<TSource>(
    this IEnumerable<TSource> source, Func<TSource, bool> predicate) 
{
    if (source == null) throw Error.ArgumentNull("source");
    if (predicate == null) throw Error.ArgumentNull("predicate");
    TSource result = default(TSource);
    long count = 0;
    foreach (TSource element in source) {
        if (predicate(element)) {
            result = element;
            checked { count++; }
        }
    }
    switch (count) {
        case 0: return default(TSource);
        case 1: return result;
    }
    throw Error.MoreThanOneMatch();
}

As you can see, 3 different strategies were used.

  • Any: Iterate the enumerator, with early exit
  • Count: Iterate the enumerator, unless ICollection, call Count
  • SingleOrDefault: Iterate the enumerator, without early exit

Some observations:

  • 'Any()' could have also been optimized for ICollection
  • 'SingleOrDefault' could have also been optimized to exit early
  • none of the methods care about IReadOnlyCollection, even though it also has a property 'Count'

Questions:

  • what are the pro's and contra's of each of these strategies?
  • is there a good reason why the implementation of LINQ is as it is?

Solution

  • Well, the first two methods are self-explanatory, aren't they? They are optimized in a way that they stop as soon as possible and also check if the type has a Count property to avoid the loop.

    But the SingleOrDefault with predicate has indeed a strange implementation. It could stop at the 2nd matching item since then is clear that a InvalidOperationException must be thrown(Single... as opposed to First... ensures that there is at maximum 1 item). But instead it checks every item and counts the matches. The version without predicate has this optimization.

    So the question is: What the heck? And it seems that is really a bug that won't be fixed since it's just reduces performance in error-case. I can't believe it.

    By the way, the same bug exists in Enumerable.Single.