python generator yield generator-expression

Generator expressions vs yield


I was watching this video (http://pyvideo.org/video/1758/loop-like-a-native-while-for-iterators-genera), where at the end the speaker says that a generator expression is the same as a normal generator function. That doesn't seem to be the case to me, even though other questions I've read about generator expressions vs yield say there is no difference. From what I can see, using yield hands control back to the for loop for each item, whereas the generator expression doesn't: it completes its task and only then goes back to the for loop. That could be a fairly big difference in memory usage (depending on what you're looping over), right? Am I right in my thinking?

# generators call yield which will return to the loop it's called in before coming back
def evens(stream):
    for n in stream:
        if n % 2 == 0:
            print("Inside evens")
            yield n

# this is the same as above just a generator expression
def evens2(stream):
    print("Inside evens2")
    return (n for n in stream if n % 2 == 0)

Solution

  • You are wrong in your thinking. Your generator expression does exactly the same thing as the generator function, with only one difference: you placed the print() call in the wrong place. In evens2 the print() runs once, when the function is called and before the generator expression has even created its generator object. In evens the print() sits inside the generator function itself, so it runs each time an even value is about to be yielded.
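    A minimal sketch of that timing difference, based on the two functions from the question. The print() calls are swapped for appends to a list named log (a name introduced here purely for illustration) so the order of events can be inspected:

```python
# Record events in a list instead of printing, so the order is easy to inspect.
log = []

def evens(stream):
    # Generator function: none of this body runs until the first next().
    for n in stream:
        if n % 2 == 0:
            log.append("inside evens")
            yield n

def evens2(stream):
    # Ordinary function: this body runs immediately when evens2() is called.
    log.append("inside evens2")
    return (n for n in stream if n % 2 == 0)

g1 = evens([1, 2, 3, 4])
g2 = evens2([1, 2, 3, 4])
after_creation = list(log)    # only evens2 has logged anything so far

first = next(g1)              # runs evens up to its first yield
after_first_next = list(log)  # now evens has logged once as well

second = next(g2)             # advances the generator expression; logs nothing
```

    Creating g1 logs nothing, creating g2 logs once, and only advancing g1 logs from inside the loop. Both generators are still equally lazy.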

    If this is Python 3 (or you used from __future__ import print_function) you could use the print() function inside the generator expression too:

    def evens2(stream):
        return (print('inside evens2') or n for n in stream if n % 2 == 0)
    

    This is the equivalent of:

    def evens(stream):
        for n in stream:
            if n % 2 == 0:
                yield print("Inside evens") or n
    

    print() always returns None, so print(..) or n evaluates to n. Iterating over either generator will both print and yield every even value of n.

    Demo:

    >>> def evens2(stream):
    ...     return (print('inside evens2') or n for n in stream if n % 2 == 0)
    ...
    >>> def evens(stream):
    ...     for n in stream:
    ...         if n % 2 == 0:
    ...             yield print("Inside evens") or n
    ...
    >>> g1 = evens([1, 2, 3, 4, 5])
    >>> g2 = evens2([1, 2, 3, 4, 5])
    >>> g1
    <generator object evens at 0x10bbf5938>
    >>> g2
    <generator object evens2.<locals>.<genexpr> at 0x10bbf5570>
    >>> next(g1)
    Inside evens
    2
    >>> next(g2)
    inside evens2
    2
    >>> next(g1)
    Inside evens
    4
    >>> next(g2)
    inside evens2
    4
    

    Both calls produce a generator object, and both generator objects print additional information each time you advance them one step with next().
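    To address the memory concern directly, here is a small sketch showing that neither form consumes the input up front. The tracked() helper and the pulled_* lists are hypothetical names added for this illustration; itertools.count and itertools.islice are standard library functions:

```python
from itertools import count, islice

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n

def evens2(stream):
    return (n for n in stream if n % 2 == 0)

def tracked(items, pulled):
    # Hypothetical helper: records every item the consumer actually requests.
    for item in items:
        pulled.append(item)
        yield item

pulled_func, pulled_genexp = [], []
g1 = evens(tracked([1, 2, 3, 4, 5], pulled_func))
g2 = evens2(tracked([1, 2, 3, 4, 5], pulled_genexp))

first_func = next(g1)      # pulls 1 (skipped) and 2 (yielded), then pauses
first_genexp = next(g2)    # the generator expression pauses at the same point

# count(1) never ends, so an eager implementation would never return here.
first_three_func = list(islice(evens(count(1)), 3))
first_three_genexp = list(islice(evens2(count(1)), 3))
```

    After a single next(), both pulled_func and pulled_genexp hold only [1, 2], so the generator expression has not run ahead of the for loop that consumes it.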

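    As far as Python is concerned, the two generator objects run more or less the same bytecode (the listings below come from an older CPython release, so offsets and some opcodes will differ on current versions):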

    >>> import dis
    >>> dis.dis(compile('(n for n in stream if n % 2 == 0)', '', 'exec').co_consts[0])
      1           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                27 (to 33)
                  6 STORE_FAST               1 (n)
                  9 LOAD_FAST                1 (n)
                 12 LOAD_CONST               0 (2)
                 15 BINARY_MODULO
                 16 LOAD_CONST               1 (0)
                 19 COMPARE_OP               2 (==)
                 22 POP_JUMP_IF_FALSE        3
                 25 LOAD_FAST                1 (n)
                 28 YIELD_VALUE
                 29 POP_TOP
                 30 JUMP_ABSOLUTE            3
            >>   33 LOAD_CONST               2 (None)
                 36 RETURN_VALUE
    >>> dis.dis(compile('''\
    ... def evens(stream):
    ...     for n in stream:
    ...         if n % 2 == 0:
    ...             yield n
    ... ''', '', 'exec').co_consts[0])
      2           0 SETUP_LOOP              35 (to 38)
                  3 LOAD_FAST                0 (stream)
                  6 GET_ITER
            >>    7 FOR_ITER                27 (to 37)
                 10 STORE_FAST               1 (n)
    
      3          13 LOAD_FAST                1 (n)
                 16 LOAD_CONST               1 (2)
                 19 BINARY_MODULO
                 20 LOAD_CONST               2 (0)
                 23 COMPARE_OP               2 (==)
                 26 POP_JUMP_IF_FALSE        7
    
      4          29 LOAD_FAST                1 (n)
                 32 YIELD_VALUE
                 33 POP_TOP
                 34 JUMP_ABSOLUTE            7
            >>   37 POP_BLOCK
            >>   38 LOAD_CONST               0 (None)
                 41 RETURN_VALUE
    

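    Both use FOR_ITER to loop, COMPARE_OP to check whether the result of BINARY_MODULO equals 0, and YIELD_VALUE to yield the value of n. The only real difference is in the setup: the generator function calls GET_ITER on stream itself (inside a SETUP_LOOP block), while the generator expression receives an already-created iterator as its implicit .0 argument.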
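    Because the raw disassembly changes between CPython versions, the comparison can also be sketched programmatically. This assumes CPython with dis.get_instructions available (3.4+). It checks only for FOR_ITER and YIELD_VALUE, which I expect to be present on current releases, though opcode names are an implementation detail. The opnames() helper is a name made up for this example:

```python
import dis

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n

# A generator expression with the same filter; the empty list is a placeholder.
genexp = (n for n in [] if n % 2 == 0)

def opnames(code):
    # Hypothetical helper: the set of opcode names used by a code object.
    return {ins.opname for ins in dis.get_instructions(code)}

func_ops = opnames(evens.__code__)     # code object of the generator function
genexp_ops = opnames(genexp.gi_code)   # code object behind the generator expression

shared = {"FOR_ITER", "YIELD_VALUE"}
both_loop_and_yield = shared <= func_ops and shared <= genexp_ops
only_in_one = func_ops ^ genexp_ops    # opcodes that appear in just one of the two
```

    Inspecting only_in_one on your own interpreter shows which setup opcodes differ. What it contains depends on the CPython version.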