c# iterator yield-return deferred-execution

How is transforming this iterator block a functional change?


Given the following code snippet:

public class Foo
{
    public IEnumerable<string> Sequence { get; set; }
    public IEnumerable<string> Bar()
    {
        foreach (string s in Sequence)
            yield return s;
    }
}

is the following snippet semantically equivalent, or is it different? If it is different, how do they function differently?

public class Foo2
{
    public IEnumerable<string> Sequence { get; set; }
    public IEnumerable<string> Bar2()
    {
        return Sequence;
    }
}

This question is inspired by another question, which asks something different about a similar situation.


Solution

  • The two are not equivalent. They differ in when the Sequence property is read. Foo2.Bar2 reads Sequence at the moment you call Bar2 and returns whatever IEnumerable value the property holds at that time. Foo.Bar is an iterator block, so calling it runs none of the method body; Sequence is read only when you enumerate the sequence returned by Bar, and it is read again on every new enumeration.

    We can write a simple enough program to observe the differences here.

    //Using iterator block
    var foo = new Foo();
    foo.Sequence = new[] { "Old" };
    var query = foo.Bar();
    foo.Sequence = new[] { "New" };
    Console.WriteLine(string.Join(" ", query));
    
    //Not using iterator block
    var foo2 = new Foo2();
    foo2.Sequence = new[] { "Old" };
    var query2 = foo2.Bar2();
    foo2.Sequence = new[] { "New" };
    Console.WriteLine(string.Join(" ", query2));
    

    This prints out:

    New
    Old
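To see why the iterator block prints "New", it can help to write by hand roughly what it amounts to. The following is a minimal sketch, not the compiler's actual output: `FooByHand`, `DeferredSequence`, and `Demo` are hypothetical names introduced for illustration. The point it tries to show is that the returned object holds on to the Foo instance rather than to the current value of Sequence. One simplification to be aware of: a real iterator block reads Sequence on the first MoveNext call, while this sketch reads it in GetEnumerator. Either way the read happens after Bar has returned.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written stand-in for the iterator-block version of Foo.
public class FooByHand
{
    public IEnumerable<string> Sequence { get; set; }

    public IEnumerable<string> Bar()
    {
        // Nothing reads Sequence here; we only capture "this".
        return new DeferredSequence(this);
    }

    // Illustrative wrapper; the compiler generates a more elaborate state machine.
    private sealed class DeferredSequence : IEnumerable<string>
    {
        private readonly FooByHand owner;

        public DeferredSequence(FooByHand owner)
        {
            this.owner = owner;
        }

        public IEnumerator<string> GetEnumerator()
        {
            // Sequence is read at enumeration time, on every enumeration.
            return owner.Sequence.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}

public static class Demo
{
    public static string Run()
    {
        var foo = new FooByHand { Sequence = new[] { "Old" } };
        var query = foo.Bar();
        foo.Sequence = new[] { "New" };   // reassigned before enumeration
        return string.Join(" ", query);   // enumeration happens here
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "New"
    }
}
```

Foo2.Bar2, by contrast, hands back the array that Sequence referred to at call time, so a later reassignment of the property is invisible to the caller.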

    In this particular case our Bar method has no side effects. If it did, it would be even more important to understand the semantics that your program has, and the semantics it should have. For example, let's modify the two methods so that they have some observable side effects:

    public class Foo
    {
        public IEnumerable<string> Sequence { get; set; }
        public IEnumerable<string> IteratorBlock()
        {
            Console.WriteLine("I'm iterating Sequence in an iterator block");
            foreach (string s in Sequence)
                yield return s;
        }
        public IEnumerable<string> NoIteratorBlock()
        {
            Console.WriteLine("I'm iterating Sequence without an iterator block");
            return Sequence;
        }
    }
    

    Now let's try comparing these two methods to see how they function:

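    // Requires "using System.Linq;" for Count(). Any non-null sequence works here.
    var foo = new Foo { Sequence = new[] { "a", "b" } };
    var query = foo.IteratorBlock();      // prints nothing yet
    var query2 = foo.NoIteratorBlock();   // prints its message immediately
    Console.WriteLine("---");
    query.Count();    // runs the iterator body, printing its message
    query.Count();    // runs the iterator body again
    query2.Count();   // prints nothing
    query2.Count();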
    

    This will print out:

    I'm iterating Sequence without an iterator block
    ---
    I'm iterating Sequence in an iterator block
    I'm iterating Sequence in an iterator block

    Here we can see that the non-iterator method's side effects happen when the method itself is called, while the iterator block's side effects do not happen at that point. Later, enumerating the result of the non-iterator method causes no side effects at all, whereas the iterator block's side effects occur each time the query is enumerated.
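The ordering described above can also be checked programmatically rather than by reading console output. The following is a small self-contained sketch under a few assumptions: `LoggingFoo`, its `Log` list, and `Demo` are hypothetical names that do not appear in the original answer, and the side effect is recorded in a list instead of being written to the console.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of Foo that records side effects in a list.
public class LoggingFoo
{
    public List<string> Log { get; } = new List<string>();
    public IEnumerable<string> Sequence { get; set; }

    public IEnumerable<string> IteratorBlock()
    {
        Log.Add("iterator body ran");   // runs on each enumeration
        foreach (string s in Sequence)
            yield return s;
    }

    public IEnumerable<string> NoIteratorBlock()
    {
        Log.Add("plain method body ran");   // runs once, at call time
        return Sequence;
    }
}

public static class Demo
{
    public static List<string> Run()
    {
        var foo = new LoggingFoo { Sequence = new[] { "a", "b" } };
        var query = foo.IteratorBlock();      // logs nothing yet
        var query2 = foo.NoIteratorBlock();   // logs immediately
        foo.Log.Add("---");
        query.Count();    // runs the iterator body
        query.Count();    // runs the iterator body again
        query2.Count();   // just counts the array, logs nothing
        query2.Count();
        return foo.Log;
    }

    public static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

The log ends up with one "plain method body ran" entry before the separator and two "iterator body ran" entries after it, matching the console output shown earlier.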