Tags: python, generator, coroutine, yield, yield-from

Avoiding extra `next` call after `yield from` in Python generator


Please see the snippet below, run with Python 3.10:

from collections.abc import Generator

DUMP_DATA = 5, 6, 7

class DumpData(Exception):
    """Exception used to indicate to yield from DUMP_DATA."""

def sample_gen() -> Generator[int | None, int, None]:
    out_value: int | None = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            yield from DUMP_DATA
            out_value = None
            continue
        out_value = in_value

My question pertains to the DumpData path, which contains a `yield from`. After that `yield from`, an extra next(g) call is needed to bring the generator back to the main yield statement so that we can send again:

def main() -> None:
    g = sample_gen()
    next(g)  # Initialize
    assert g.send(1) == 1
    assert g.send(2) == 2

    # Okay let's dump the data
    num_data = g.throw(DumpData)
    data = tuple(next(g) for _ in range(num_data))
    assert data == DUMP_DATA

    # How can one avoid this `next` call, before it works again?
    next(g)
    assert g.send(3) == 3

How can this extra next call be avoided?


Solution

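  • When you yield from a tuple directly, sample_gen delegates to the built-in tuple_iterator. After the last item has been yielded, sample_gen is still suspended inside the yield from; it takes one more resumption for the iterator to raise StopIteration and for the yield from to finish, and that resumption is the extra next call. Unlike a generator, tuple_iterator has no send method, so that resumption cannot carry a value (sending a non-None value raises AttributeError), and the yield from expression evaluates to None.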

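    Conceptually, the delegation behaves like this (the real tuple_iterator is a built-in iterator, not a generator):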

    yield from DUMP_DATA  # is equivalent to:
    yield from tuple_iterator(DUMP_DATA)
    
    def tuple_iterator(t):
        for item in t:
            yield item
        return None
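
To make the suspended state visible, here is a minimal sketch that reuses the generator from the question and inspects it with the generator attribute gi_yieldfrom. The helper names dump, inspect_after_dump and send_too_early are hypothetical and only serve this illustration.

```python
DUMP_DATA = (5, 6, 7)


class DumpData(Exception):
    """Thrown into the generator to request a dump of DUMP_DATA."""


def sample_gen():
    """The generator from the question, with annotations omitted."""
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            yield from DUMP_DATA
            out_value = None
            continue
        out_value = in_value


def dump(g):
    """Hypothetical helper: throw DumpData and collect the announced items."""
    num_data = g.throw(DumpData)
    return tuple(next(g) for _ in range(num_data))


def inspect_after_dump():
    g = sample_gen()
    next(g)  # prime the generator
    data = dump(g)
    # gi_yieldfrom is the object a suspended generator is delegating to.
    delegate_name = type(g.gi_yieldfrom).__name__
    # This resumption exhausts the tuple iterator and reaches the main yield.
    extra = next(g)
    return data, delegate_name, extra, g.gi_yieldfrom


def send_too_early():
    g = sample_gen()
    next(g)
    dump(g)
    try:
        # The value is forwarded to the tuple iterator, which has no send().
        g.send(3)
    except AttributeError as exc:
        return type(exc).__name__
    return None
```

After the last item arrives, the generator is still delegating to the tuple iterator, which is why a plain next(g) is needed before send works again in the original code.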
    

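    You can implement a tuple_iterator_generator that captures the value sent at its last yield and returns it, so that the yield from expression evaluates to that value. Usage: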

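    # Inside the `while True` loop of sample_gen:
    try:
        in_value = yield out_value
    except DumpData:
        yield len(DUMP_DATA)
        # The value sent at the last item becomes the result of `yield from`.
        in_value = yield from tuple_iterator_generator(DUMP_DATA)
    out_value = in_value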
    
    def tuple_iterator_generator(t):
        in_value = None
        for item in t:
            in_value = yield item
        return in_value
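
Putting the two fragments together, a complete version might look like the sketch below. It assumes the same DUMP_DATA and DumpData as in the question; the run function is a hypothetical driver that mirrors the question's main without the extra next call.

```python
DUMP_DATA = (5, 6, 7)


class DumpData(Exception):
    """Thrown into the generator to request a dump of DUMP_DATA."""


def tuple_iterator_generator(t):
    """Yield each item and return whatever was sent at the last yield."""
    in_value = None
    for item in t:
        in_value = yield item
    return in_value


def sample_gen():
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            # send() is forwarded to the subgenerator; its return value
            # becomes the value of the whole `yield from` expression.
            in_value = yield from tuple_iterator_generator(DUMP_DATA)
        out_value = in_value


def run():
    """Hypothetical driver: same steps as the question, no extra next()."""
    g = sample_gen()
    next(g)  # prime the generator
    before = (g.send(1), g.send(2))
    num_data = g.throw(DumpData)
    data = tuple(next(g) for _ in range(num_data))
    after = g.send(3)  # lands on the last `yield item`, then echoes back
    return before, data, after
```

The send that follows the dump is received by the subgenerator's last yield, returned through yield from, and echoed by the main yield on the next loop iteration.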
    

    Or simply avoid yield from if you don't want that behavior:

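    # Inside the `while True` loop of sample_gen:
    try:
        in_value = yield out_value
    except DumpData:
        yield len(DUMP_DATA)
        for out_value in DUMP_DATA:
            # Each yield can receive a sent value; the last one is kept.
            in_value = yield out_value
    out_value = in_value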
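
As a rough check that the loop-based variant can be dumped repeatedly without an extra next call, the following sketch wraps the fragment in a full generator. The run_twice function is a hypothetical driver, not part of the original question.

```python
DUMP_DATA = (5, 6, 7)


class DumpData(Exception):
    """Thrown into the generator to request a dump of DUMP_DATA."""


def sample_gen():
    out_value = None
    while True:
        try:
            in_value = yield out_value
        except DumpData:
            yield len(DUMP_DATA)
            for out_value in DUMP_DATA:
                # A plain yield accepts send() directly, so the value sent
                # after the last item is kept in in_value.
                in_value = yield out_value
        out_value = in_value


def run_twice():
    """Hypothetical driver: dump, send, dump again, send again."""
    g = sample_gen()
    next(g)  # prime the generator
    results = []
    for value in (3, 4):
        num_data = g.throw(DumpData)
        data = tuple(next(g) for _ in range(num_data))
        results.append((data, g.send(value)))
    return results
```

Because no delegation is involved, the generator is never left suspended inside an iterator that cannot accept sent values.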
    

    See https://docs.python.org/3/whatsnew/3.3.html#pep-380-syntax-for-delegating-to-a-subgenerator for a use case of that behavior.