I was going through Jon Skeet's Reimplementing LINQ to Objects series. In the article on implementing Where, I found the following snippets, but I don't understand what advantage we get by splitting the original method into two.
Original Method:
// Naive validation - broken!
public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }
    foreach (TSource item in source)
    {
        if (predicate(item))
        {
            yield return item;
        }
    }
}
Refactored Method:
public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }
    return WhereImpl(source, predicate);
}

private static IEnumerable<TSource> WhereImpl<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    foreach (TSource item in source)
    {
        if (predicate(item))
        {
            yield return item;
        }
    }
}
Jon says it's for eager validation, with the rest of the work deferred, but I don't get it.
Could someone please explain in a little more detail what the difference between these two methods is, and why the validation is performed eagerly in one but not in the other?
Conclusion/Solution:
I got confused because I didn't understand how a method is determined to be an iterator-generator. I assumed it was based on the method's signature, such as a return type of IEnumerable<T>. Based on the answers, I now get it: a method is an iterator-generator if it uses yield statements.
The broken code is a single method that is really an iterator-generator. That means calling it just returns a state machine without running any of the method body. Only when the calling code calls MoveNext (most likely as part of a foreach loop) does it execute everything from the beginning up to the first yield return.
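A minimal sketch of that deferral. The method `Numbers` and the flag `BodyStarted` are hypothetical names invented for illustration; the flag only exists to show when the iterator body actually starts running.

```csharp
using System;
using System.Collections.Generic;

public static class DeferralDemo
{
    // Hypothetical flag, used only to observe when the iterator body starts.
    public static bool BodyStarted;

    // Contains 'yield return', so the compiler rewrites it as a state machine.
    public static IEnumerable<int> Numbers()
    {
        BodyStarted = true;
        yield return 1;
        yield return 2;
    }

    public static void Main()
    {
        IEnumerable<int> sequence = Numbers();
        Console.WriteLine(BodyStarted); // False: the call ran none of the body

        using (IEnumerator<int> iterator = sequence.GetEnumerator())
        {
            Console.WriteLine(BodyStarted); // False: still nothing has run
            iterator.MoveNext();
            Console.WriteLine(BodyStarted); // True: body ran up to the first yield return
        }
    }
}
```

Any statement placed before the first yield return, including an argument check, is delayed in the same way as the assignment to the flag here.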
With the correct code, Where is not an iterator-generator, so it executes everything immediately, like a normal method. Only WhereImpl is an iterator-generator. So the validation is executed right away, while the body of WhereImpl, up to and including the first yield return, is deferred until the first MoveNext call.
So if you have something like:
IEnumerable<int> evens = list.Where(null); // Correct code gives error here.
foreach(int i in evens) // Broken code gives it here.
the broken version won't give you an error until you start iterating.
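To see both behaviours side by side, here is a self-contained sketch. The names `WhereBroken` and `WhereFixed` are assumptions made so that both versions can live in one class; they are not from the article, and the bodies are condensed copies of the two snippets above.

```csharp
using System;
using System.Collections.Generic;

public static class WhereDemo
{
    // Single iterator method: the null checks are part of the deferred body.
    public static IEnumerable<T> WhereBroken<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");
        foreach (T item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    // Ordinary method: validates immediately, then hands off to the iterator.
    public static IEnumerable<T> WhereFixed<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");
        return WhereImpl(source, predicate);
    }

    // Only this method contains 'yield return', so only it is deferred.
    private static IEnumerable<T> WhereImpl<T>(
        IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4 };

        // No exception on this line: the checks have not run yet.
        IEnumerable<int> broken = list.WhereBroken(null);
        try
        {
            foreach (int i in broken) { }
        }
        catch (ArgumentNullException e)
        {
            Console.WriteLine("Broken version threw while iterating: " + e.ParamName);
        }

        try
        {
            list.WhereFixed(null); // throws here, before any iteration
        }
        catch (ArgumentNullException e)
        {
            Console.WriteLine("Fixed version threw at the call: " + e.ParamName);
        }
    }
}
```

In the broken version the exception surfaces wherever the sequence is first enumerated, which may be far from the call that passed the bad argument. That distance is what the split is meant to remove.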