Search code examples
c#linq.net-coreside-effects

Side effects executed less than expected (LINQ)


Here is a test using NUnit style:

[Test]
public void Test()
{
    var i = 0;
    var v = Enumerable.Repeat(0, 10).Select(x =>
    {
        i++;
        return x;
    }).Last();

    Assert.That(v, Is.EqualTo(0));
    Assert.That(i, Is.EqualTo(10));
}

Unexpectedly, it fails:

Message:
Expected: 10
But was:  1

It is a big surprise that the side effect to increase i only performed once, not ten times. So I try to replace those LINQ methods with my own intuitive/naive implementation:

private T MyLast<T>(IEnumerable<T> values)
{
    var enumerator = values.GetEnumerator();
    while (enumerator.MoveNext())
    {
    }
    return enumerator.Current;
}

private IEnumerable<T> MyRepeat<T>(T value, int count)
{
    for(var i = 0; i<count; ++i)
    {
        yield return value;
    }
}

I omit the changed code; but you can verify that, if the code use MyRepeat rather than Enumerable.Repeat, OR use MyLast rather than Enumerable.Last, the test is passing. So obviously both methods are implemented differently.

(The above were tested in .NET 6, but the original observation was in a piece of code using .NET Core 3.1)

So my question is: how LINQ implemented those methods in a way that causes such weird behavior?


Solution

  • The .NET Core LINQ implementation includes various optimizations for known types of enumerable sequences. So some operators like the Last and the ElementAt might use a shorter path to return the result, instead of enumerating the sequence element by element. For example queries for the List<T> can be optimized, because this collection offers indexed access to its elements, while the Queue<T> does not. Apparently sequences produced by the Enumerable.Repeat and the Enumerable.Range can be optimized too. Here is another example that reproduces your observations:

    Test(new List<int>(Enumerable.Range(0, 10)));
    Test(new Queue<int>(Enumerable.Range(0, 10)));
    Test(Enumerable.Range(0, 10));
    
    static void Test(IEnumerable<int> source)
    {
        int iterations = 0;
        int result = source.Select(x => { iterations++; return x; }).ElementAt(5);
        Console.WriteLine($"{source}, result: {result}, Iterations: {iterations}");
    }
    

    Output

    System.Collections.Generic.List`1[System.Int32], result: 5, Iterations: 1
    System.Collections.Generic.Queue`1[System.Int32], result: 5, Iterations: 6
    System.Linq.Enumerable+RangeIterator, result: 5, Iterations: 1
    

    Online demo.

    The performance optimizations for the Enumerable.Repeat are located in this source code file: