
Return or yield from a function that calls a generator?


I have a generator function, generator, and a convenience function that wraps it, generate_all.

def generator(some_list):
  for i in some_list:
    yield do_something(i)

def generate_all():
  some_list = get_the_list()
  return generator(some_list) # <-- Is this supposed to be return or yield?

Should generate_all return or yield? I want callers to be able to use both functions the same way, i.e.

for x in generate_all():
  ...

should be equivalent to

some_list = get_the_list()
for x in generator(some_list):
  ...

Solution

  • Generators are evaluated lazily, so return and yield behave differently when you debug your code or when an exception is thrown.

    With return, an exception raised inside generator will not show generate_all in its traceback, because by the time generator actually runs you have already left generate_all. With yield from, generate_all does appear in the traceback.

    def generator(some_list):
        for i in some_list:
            raise Exception('exception happened :-)')
            yield i
    
    def generate_all():
        some_list = [1,2,3]
        return generator(some_list)
    
    for item in generate_all():
        ...
    
    Exception                                 Traceback (most recent call last)
    <ipython-input-3-b19085eab3e1> in <module>
          8     return generator(some_list)
          9 
    ---> 10 for item in generate_all():
         11     ...
    
    <ipython-input-3-b19085eab3e1> in generator(some_list)
          1 def generator(some_list):
          2     for i in some_list:
    ----> 3         raise Exception('exception happened :-)')
          4         yield i
          5 
    
    Exception: exception happened :-)
    

    And with yield from:

    def generate_all():
        some_list = [1,2,3]
        yield from generator(some_list)
    
    for item in generate_all():
        ...
    
    Exception                                 Traceback (most recent call last)
    <ipython-input-4-be322887df35> in <module>
          8     yield from generator(some_list)
          9 
    ---> 10 for item in generate_all():
         11     ...
    
    <ipython-input-4-be322887df35> in generate_all()
          6 def generate_all():
          7     some_list = [1,2,3]
    ----> 8     yield from generator(some_list)
          9 
         10 for item in generate_all():
    
    <ipython-input-4-be322887df35> in generator(some_list)
          1 def generator(some_list):
          2     for i in some_list:
    ----> 3         raise Exception('exception happened :-)')
          4         yield i
          5 
    
    Exception: exception happened :-)
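
    The difference between the two tracebacks can also be checked programmatically. The following is a minimal sketch, assuming the standard-library traceback module; the names generate_all_return, generate_all_yield_from and frames_in_traceback are hypothetical helpers introduced only for this illustration.

```python
import traceback

def generator(some_list):
    for i in some_list:
        raise Exception('exception happened :-)')
        yield i

def generate_all_return():
    # Plain function: hands back the generator object and exits immediately.
    return generator([1, 2, 3])

def generate_all_yield_from():
    # Generator function: stays on the stack while delegating to generator.
    yield from generator([1, 2, 3])

def frames_in_traceback(make_iterable):
    """Iterate over make_iterable() and return the function names in the traceback."""
    try:
        for _ in make_iterable():
            pass
    except Exception as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

return_frames = frames_in_traceback(generate_all_return)
yield_from_frames = frames_in_traceback(generate_all_yield_from)

print(return_frames)
print(yield_from_frames)
```

    Only the second list should contain the wrapper's name, which matches the two tracebacks above.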
    

    However, this comes at a performance cost: the additional generator layer adds some overhead, so return is generally a bit faster than yield from ... (or for item in ...: yield item). In most cases this won't matter, because the work done inside the generator typically dominates the run time and the extra layer isn't noticeable.

    On the other hand, yield has additional advantages: you aren't restricted to a single iterable, and you can easily yield additional items:
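
    If you want to measure the overhead on your own machine, something like the following rough sketch could be used. It assumes the standard-library timeit module; the function names, the size of data and the repeat count are arbitrary choices for illustration, and the absolute numbers will vary between machines and Python versions.

```python
import timeit

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_return(some_list):
    # No extra generator layer: the caller iterates generator directly.
    return generator(some_list)

def generate_all_yield_from(some_list):
    # One extra generator layer that delegates to generator.
    yield from generator(some_list)

data = list(range(1000))

# Consume each iterable completely so the whole iteration cost is measured.
t_return = timeit.timeit(lambda: sum(generate_all_return(data)), number=200)
t_yield_from = timeit.timeit(lambda: sum(generate_all_yield_from(data)), number=200)

print(f"return:     {t_return:.4f} s")
print(f"yield from: {t_yield_from:.4f} s")
```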

    def generator(some_list):
        for i in some_list:
            yield i
    
    def generate_all():
        some_list = [1,2,3]
        yield 'start'
        yield from generator(some_list)
        yield 'end'
    
    for item in generate_all():
        print(item)
    
    start
    1
    2
    3
    end
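
    One pitfall is worth spelling out: once a function body contains a yield, a return with a value no longer hands the iterable to the caller, so the two styles cannot be mixed. A minimal sketch follows; the names generate_all_mixed and generate_all_chained are hypothetical, and itertools.chain is shown as one possible way to add extra items while still using return.

```python
import itertools

def generator(some_list):
    for i in some_list:
        yield i

def generate_all_mixed():
    # The yield makes this a generator function, so the returned generator
    # is attached to StopIteration and is never iterated by the for loop.
    yield 'start'
    return generator([1, 2, 3])

def generate_all_chained():
    # Plain function: returns a single iterator that chains all the parts.
    return itertools.chain(['start'], generator([1, 2, 3]), ['end'])

mixed_items = list(generate_all_mixed())
chained_items = list(generate_all_chained())

print(mixed_items)    # ['start']
print(chained_items)  # ['start', 1, 2, 3, 'end']
```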
    

    In your case the operations are quite simple, and I'm not sure it's even necessary to create multiple functions for this; you could just use the built-in map or a generator expression instead:

    map(do_something, get_the_list())          # map
    (do_something(i) for i in get_the_list())  # generator expression
    

    Both should behave identically in use (apart from some differences when exceptions occur). If they need a more descriptive name, you can still wrap them in a function.

    Several built-in helpers wrap very common operations on iterables, and more can be found in the standard-library itertools module. In simple cases like this I would use those and write your own generators only for non-trivial cases.

    But I assume your real code is more complicated, so this may not apply; still, the answer wouldn't be complete without mentioning the alternatives.
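
    As a quick check that the two forms are interchangeable, here is a small sketch. The bodies of get_the_list and do_something are stand-ins invented for this example (the question does not show them), and the calls list exists only to show that both forms are lazy.

```python
calls = []  # records every call to do_something, to make laziness visible

def get_the_list():
    # Hypothetical stand-in for the question's get_the_list.
    return [1, 2, 3]

def do_something(i):
    # Hypothetical stand-in for the question's do_something.
    calls.append(i)
    return i * i

mapped = map(do_something, get_the_list())
expressed = (do_something(i) for i in get_the_list())

# Nothing has been computed yet: both objects are lazy iterators.
calls_before_consuming = list(calls)

mapped_result = list(mapped)
expressed_result = list(expressed)

print(calls_before_consuming)             # []
print(mapped_result == expressed_result)  # True
```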
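
    For illustration, a few of these helpers combined in a short sketch. get_the_list and do_something are again hypothetical stand-ins, and which helpers fit depends on what your real code does.

```python
import itertools

def get_the_list():
    # Hypothetical stand-in for the question's get_the_list.
    return [1, 2, 3, 4, 5]

def do_something(i):
    # Hypothetical stand-in for the question's do_something.
    return i * 10

# Take only the first three results without computing the rest.
first_three = list(itertools.islice(map(do_something, get_the_list()), 3))

# Join several iterables into one stream.
combined = list(itertools.chain(get_the_list(), [6, 7]))

# Keep only the results that satisfy a condition.
big_only = list(filter(lambda x: x > 20, map(do_something, get_the_list())))

print(first_three)  # [10, 20, 30]
print(combined)     # [1, 2, 3, 4, 5, 6, 7]
print(big_only)     # [30, 40, 50]
```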