I've been playing with Lists and Enumerables and I think I understand the basics:
I've done some tests:
Starting with the Enumerable example:
var myList = new List<int>() { 1, 2, 3, 4, 5, 6 };
var myEnumerable = myList.Where(p =>
{
Console.Write($"{p} ");
return p > 2;
}
);
Console.WriteLine("");
Console.WriteLine("Starting");
myEnumerable.First();
Console.WriteLine("");
myEnumerable.Skip(1).First();
The output is:
Starting
1 2 3
1 2 3 4
If we add .ToList()
after the .Where(...)
then the output is:
1 2 3 4 5 6
Starting
I also was able to have a bit of both worlds with this class:
class SingleEvaluationEnum<T>
{
private IEnumerable<T> Enumerable;
public SingleEvaluationEnum(IEnumerable<T> enumerable)
=> Enumerable = enumerable;
public IEnumerable<T> Get()
{
if (!(Enumerable is List<T>))
Enumerable = Enumerable.ToList().AsEnumerable();
return Enumerable;
}
}
You can see the output is:
Starting
1 2 3 4 5 6
This way the evaluation is deferred until the first consumption and is not re-evaluated in the next ones. But the whole list is evaluated.
My question is: Is there a way to get this output?
Starting
1 2 3
4
In other words: I want myEnumerable.First()
to evaluate only the necesary elements, but no more. And I want myEnumerable.Skip(1).First()
to reuse the already evaluated elements.
EDIT: Clarification: I want that any "query" over the Enumerable applies to all the elements in the list. That's why (AFAIK) an Enumerator doesn't work.
Thanks!
LINQ is fundamentally a functional approach to working with collections. One of the assumptions is that there are no side-effects to evaluating the functions. You're violating that assumption by calling Console.Write
in the function.
There's no magic involved, just functions. IEnumerable
has just one method - GetEnumerator
. That's all that is needed for LINQ, and that's all that LINQ really does. For example, a naïve implementation of Where
would look like this:
public static IEnumerable<T> Where<T>(this IEnumerable<T> @this, Func<T, bool> filter)
{
foreach (var item in @this)
{
if (filter(item)) yield return item;
}
}
A Skip
might look like this:
public static IEnumerable<T> Skip<T>(this IEnumerable<T> @this, int skip)
{
foreach (var item in @this)
{
if (skip-- > 0) continue;
yield return item;
}
}
That's all there is to it. It doesn't have any information about what IEnumerable
is or represents. In fact, that's the whole point - you're abstracting those details away. There are a few optimizations in those methods, but they don't do anything smart. In the end, the difference between the List
and IEnumerable
in your example isn't anything fundamental - it's that myEnumerable.Skip(1)
has side-effects (because myEnumerable
itself has side-effects) while myList.Skip(1)
doesn't. But both do the exact same thing - evaluate the enumerable, item by item. There's no other method than GetEnumerator
on an enumerable, and IEnumerator
only has Current
and MoveNext
(of those that matter for us).
LINQ is immutable. That's one of the reasons why it's so useful. This allows you to do exactly what you're doing - query the same enumerable twice but getting the exact same result. But you're not happy with that. You want things to be mutable. Well, nothing is stopping you from making your own helper functions. LINQ is just a bunch of functions, after all - you can make your own.
One such simple extension could be a memoized enumerable. Wrap around the source enumerable, create a list internally, and when you iterate over the source enumerable, keep adding items to the list. The next time GetEnumerator
is called, start iterating over your internal list. When you reach the end, continue with the original approach - iterate over the source enumerable and keep adding to the list.
This will allow you to use LINQ fully, just inserting Memoize()
to your LINQ queries at the places where you want to avoid iterating over the source multiple times. In your example, this would be something like:
myEnumerable = myEnumerable.Memoize();
Console.WriteLine("");
Console.WriteLine("Starting");
myEnumerable.First();
Console.WriteLine("");
myEnumerable.Skip(1).First();
The first call to myEnumerable.First()
will iterate through the first three items in myList
, and the second will only work with the fourth.