
Ruby equivalent of C#'s 'yield' keyword, or, creating sequences without preallocating memory


In C#, you could do something like this:

public IEnumerable<int> GetItems()
{
    for (int i = 0; i < 10000000; i++) {
        yield return i;
    }
}

This returns an enumerable sequence of 10 million integers without ever allocating a collection in memory of that length.

Is there a way of doing an equivalent thing in Ruby? The specific example I am trying to deal with is the flattening of a rectangular array into a sequence of values to be enumerated. The return value does not have to be an Array or Set, but rather some kind of sequence that can only be iterated/enumerated in order, not by index. Consequently, the entire sequence need not be allocated in memory concurrently. In .NET, this is IEnumerable and IEnumerable<T>.

Any clarification on the terminology used here in the Ruby world would be helpful, as I am more familiar with .NET terminology.

EDIT

Perhaps my original question wasn't really clear enough -- I think the fact that yield has very different meanings in C# and Ruby is the cause of confusion here.

I don't want a solution that requires my method to use a block. I want a solution that has an actual return value. A return value allows convenient processing of the sequence (filtering, projection, concatenation, zipping, etc).

Here's a simple example of how I might use get_items:

things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }

In C#, any method returning IEnumerable that uses a yield return causes the compiler to generate a finite state machine behind the scenes that caters for this behaviour. I suspect something similar could be achieved using Ruby's continuations, but I haven't seen an example and am not quite clear myself on how this would be done.

It does indeed seem possible that I might use Enumerable to achieve this. A simple solution would be to use an Array (which includes module Enumerable), but I do not want to create an intermediate collection with N items in memory when it's possible to just provide them lazily and avoid any memory spike at all.

If this still doesn't make sense, then consider the above code example. get_items returns an enumeration, upon which select is called. What is passed to select is an instance that knows how to provide the next item in the sequence whenever it is needed. Importantly, the whole collection of items hasn't been calculated yet. Only when select needs an item will it ask for it, and the latent code in get_items will kick into action and provide it. This laziness carries along the chain, such that select only draws the next item from the sequence when map asks for it. As such, a long chain of operations can be performed on one data item at a time. In fact, code structured in this way can even process an infinite sequence of values without running out of memory.

So, this kind of laziness is easily coded in C#, and I don't know how to do it in Ruby.

I hope that's clearer (I'll try to avoid writing questions at 3AM in future.)


Solution

  • It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.

    Cliche example:

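    # Build an infinite Fibonacci sequence; nothing runs until a value is requested.
    fib = Enumerator.new do |y|
      i = 0
      j = 1
      y.yield i
      y.yield j
      loop do
        k = i + j
        y.yield k   # hands k to the consumer, then resumes here on the next request
        i = j
        j = k
      end
    end

    # Pull the first 100 values one at a time.
    100.times { puts fib.next }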
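
For the flattening case in the question, a minimal sketch along the same lines might look like the following. The method name flatten_lazily and the sample grid are made up for illustration, and the .lazy calls assume Ruby 2.0 or later (Enumerator::Lazy does not exist in 1.9 or 1.8.7). Without .lazy, select and map on an Enumerator still work, but each one builds a full intermediate Array.

```ruby
# Hypothetical helper: returns an Enumerator over the cells of a rectangular
# (nested) array, row by row, without building a flattened copy.
def flatten_lazily(grid)
  Enumerator.new do |y|
    grid.each do |row|
      row.each { |cell| y << cell }   # y << cell is equivalent to y.yield cell
    end
  end
end

grid = [[1, 2, 3], [4, 5, 6]]         # sample data, assumed for illustration

cells = flatten_lazily(grid)          # an Enumerator; no cell has been visited yet
first_two = cells.first(2)            # => [1, 2]

# With .lazy, select and map pass one element at a time down the chain;
# to_a forces the result at the end.
evens_squared = flatten_lazily(grid).lazy.select(&:even?).map { |n| n * n }.to_a
# => [4, 16, 36]

# An infinite sequence is fine as long as only a finite prefix is forced.
naturals = Enumerator.new do |y|
  n = 0
  loop { y << (n += 1) }
end
first_odd_squares = naturals.lazy.select(&:odd?).map { |n| n * n }.first(3)
# => [1, 9, 25]
```

Because flatten_lazily has a real return value, it can be chained the same way as obj.get_items in the question, and nothing is computed until a consumer such as first, to_a or each asks for elements.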