C# "Generator" Method


I come from the world of Python and am trying to create a "generator" method in C#. I'm parsing a file in chunks of a specific buffer size, and I want to read and store only one chunk at a time and yield it to a foreach loop. Here's what I have so far (simplified proof of concept):

class Page
{
    public uint StartOffset { get; set; }
    private uint currentOffset = 0;

    public Page(MyClass c, uint pageNumber)
    {
        // assign the property; declaring a local here would shadow it and leave it at 0
        StartOffset = pageNumber * c.myPageSize;

        if (StartOffset < c.myLength)
            currentOffset = StartOffset;
        else
            throw new ArgumentOutOfRangeException(nameof(pageNumber), "Page offset exceeds end of file");

        while (currentOffset < c.myLength && currentOffset < (StartOffset + c.myPageSize))
            // read data from page and populate members (not shown for MWE purposes)
            . . .
    }
}

class MyClass
{
    public uint myLength { get; set; }
    public uint myPageSize { get; set; }

    public IEnumerator<Page> GetEnumerator()
    {
        for (uint i = 1; i < this.myLength; i++)
        {
            // start count at 1 to skip first page
            Page p = new Page(this, i);
            try
            {
                yield return p;
            }
            catch (ArgumentOutOfRangeException)
            {
                // end of available pages, how to signal calling foreach loop?
            }
        }
    }
}

I know this is not perfect since it is a minimum working example (I don't allow many of these properties to be set publicly, but for keeping this simple I don't want to type private members and properties).

However, my main question is how do I let the caller looping over MyClass with a foreach statement know that there are no more items left to loop through? Is there an exception I throw to indicate there are no elements left?


Solution

  • As mentioned in the comments, you should use IEnumerable<T> instead of IEnumerator<T>. The enumerator is the technical object that is used to enumerate over something. That something is, in many cases, an enumerable.

    C# has special abilities to deal with enumerables. Most prominently, you can use a foreach loop with an enumerable (but not with an enumerator directly, even though the loop uses the enumerable's enumerator internally). Also, enumerables allow you to use LINQ, which makes them even easier to consume.
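
    To make the relationship between foreach and the enumerator concrete, here is a minimal sketch of roughly what a foreach loop does with an enumerable. The class ForeachSketch and the method Numbers are hypothetical names used only for this illustration. The point is that the end of the sequence is signalled by MoveNext() returning false, not by an exception.

    ```csharp
    using System;
    using System.Collections.Generic;

    static class ForeachSketch
    {
        // Hypothetical generator method, used only for this illustration.
        public static IEnumerable<int> Numbers()
        {
            yield return 1;
            yield return 2;
            yield return 3;
        }

        // Sums the sequence with an ordinary foreach loop.
        public static int SumWithForeach()
        {
            int sum = 0;
            foreach (int n in Numbers())
                sum += n;
            return sum;
        }

        // Roughly what the foreach loop above does behind the scenes.
        public static int SumWithEnumerator()
        {
            int sum = 0;
            using (IEnumerator<int> e = Numbers().GetEnumerator())
            {
                // MoveNext() returns false once the generator method ends
                // or reaches yield break; no exception is involved.
                while (e.MoveNext())
                    sum += e.Current;
            }
            return sum;
        }
    }
    ```

    Both methods visit the same three values, so the caller never has to detect the end of the sequence itself.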

    So you should change your class like this:

    class MyClass
    {
        public uint myLength { get; set; }
        public uint myPageSize { get; set; }
    
        // note the modified signature
        public IEnumerable<Page> GetPages()
        {
            for (uint i = 1; i < this.myLength; i++)
            {
                Page p;
                try
                {
                    p = new Page(this, i);
                }
                catch (ArgumentOutOfRangeException)
                {
                    yield break;
                }
                yield return p;
            }
        }
    }
    

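    The yield break statement ends the enumeration: the calling foreach loop simply stops, so you do not need to throw an exception to signal that there are no more items. In the end, this allows you to use it like this: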

    var obj = new MyClass();
    
    foreach (var page in obj.GetPages())
    {
        // do whatever
    }
    
    // or even using LINQ (requires "using System.Linq;")
    // StartOffset is used here because currentOffset is private to Page
    var pageOffsets = obj.GetPages().Select(p => p.StartOffset).ToList();
    

    Of course, you should also give the method a meaningful name. If you’re returning pages, GetPages is a good first step in the right direction. The name GetEnumerator is effectively reserved for types implementing IEnumerable, where the GetEnumerator method is supposed to return an enumerator for the collection the object represents.
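
    If you would rather loop over the object itself, one option is to implement IEnumerable<Page> on the class, which is the case where the name GetEnumerator is appropriate. The following is only a sketch under some assumptions: Page is reduced to a hypothetical stand-in that just stores its start offset, and the loop uses a bounds check on the offset instead of catching ArgumentOutOfRangeException as in the code above.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;

    // Simplified stand-in for the Page class from the question (assumption).
    class Page
    {
        public uint StartOffset { get; }

        public Page(uint startOffset)
        {
            StartOffset = startOffset;
        }
    }

    class MyClass : IEnumerable<Page>
    {
        public uint myLength { get; set; }
        public uint myPageSize { get; set; }

        // Implementing IEnumerable<Page> lets foreach and LINQ work on the object itself.
        public IEnumerator<Page> GetEnumerator()
        {
            // A page size of zero would never advance, so yield nothing.
            if (myPageSize == 0)
                yield break;

            // Start at 1 to skip the first page, as in the question.
            // The cast to ulong avoids overflow in the bounds check.
            for (uint i = 1; (ulong)i * myPageSize < myLength; i++)
                yield return new Page(i * myPageSize);

            // Reaching the end of the method ends the enumeration.
        }

        // The non-generic interface member forwards to the generic one.
        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
    ```

    With myLength = 10 and myPageSize = 4, foreach (var page in obj) visits two pages, with start offsets 4 and 8. Whether to expose a named method such as GetPages or to implement IEnumerable<Page> is a design choice; the named method is usually clearer when the class represents more than just a collection of pages.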