Python nested generators


I was trying to implement the inverse of itertools.izip on Python 2.7.1, and I've run into a problem that I can't explain. Solution 1, iunzip_v1, works perfectly, but solution 2, iunzip_v2, doesn't work as expected. So far I haven't found any relevant information about this problem, and from reading the PEP about generators it sounds like it should work, but it doesn't.

import itertools
from operator import itemgetter

def iunzip_v1(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple(itertools.imap(itemgetter(i), it) for i, it in enumerate(iters))

def iunzip_v2(iterable):
    _tmp, iterable = itertools.tee(iterable, 2)
    iters = itertools.tee(iterable, len(_tmp.next()))
    return tuple((elem[i] for elem in it) for i, it in enumerate(iters))

Result:

In [17]: l
Out[17]: [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]

In [18]: map(list, iunzip.iunzip_v1(l))
Out[18]: [[0, 1, 2, 3, 4], [0, 2, 4, 6, 8], [0, 3, 6, 9, 12]]

In [19]: map(list, iunzip.iunzip_v2(l))
Out[19]: [[0, 3, 6, 9, 12], [0, 3, 6, 9, 12], [0, 3, 6, 9, 12]]

It seems that iunzip_v2 is using the last value of i, so the inner generators aren't keeping the value of i that was current when they were created inside the outer generator. I'm missing something and I don't know what it is.

Thanks in advance to anyone who can clarify this situation for me.

UPDATE: I've found the explanation in PEP 289; my first read was PEP 255. The solution I'm trying to implement is a lazy one, so:

  zip(*iter) or izip(*...)

doesn't work for me, because *arg expands the argument list, which consumes the whole iterable up front.
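To illustrate why the star form is not lazy, here is a minimal sketch. It is written for Python 3, where the built-in zip plays the role of izip; the generator rows and the list pulled are hypothetical names used only for this demonstration.

```python
pulled = []

def rows():
    # Record each row at the moment it is pulled from the generator.
    for row in [(0, 0, 0), (1, 2, 3), (2, 4, 6)]:
        pulled.append(row)
        yield row

# The * unpacking has to build the full argument list,
# so rows() is drained before zip is even called.
columns = zip(*rows())
pulled_before_iteration = list(pulled)
```

Nothing has been read from columns yet, but pulled_before_iteration already holds every row.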


Solution

  • You're reinventing the wheel in a crazy way. izip is its own inverse:

    >>> list(izip(*izip(range(10), range(10))))
    [(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)]
    

    But that doesn't quite answer your question, does it?

    The problem with your nested generators is a scoping problem that happens because the innermost generators don't get used until the outermost generator has already run:

    def iunzip_v2(iterable):
        _tmp, iterable = itertools.tee(iterable, 2)
        iters = itertools.tee(iterable, len(_tmp.next()))
        return tuple((elem[i] for elem in it) for i, it in enumerate(iters))
    

    Here, you generate three generators, each of which uses the same variable, i. Copies of this variable are not made. Then, tuple exhausts the outermost generator, creating a tuple of generators:

    >>> iunzip_v2((range(3), range(3)))
    (<generator object <genexpr> at 0x1004d4a50>, <generator object <genexpr> at 0x1004d4aa0>, <generator object <genexpr> at 0x1004d4af0>)
    

    At this point, each of these generators will execute elem[i] for each element of it. And since i is now equal to 2 (the last value produced by enumerate) for all three generators, you get the last element each time.
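The late lookup of i can be reproduced without tee at all. The following is a small sketch for Python 3 (the names rows, late and eager are made up for the example); it builds the inner generators first and iterates them afterwards, then contrasts that with consuming each inner generator while the outer loop is still on the matching i.

```python
rows = [(0, 0, 0), (1, 2, 3), (2, 4, 6)]

# Build three lazy column extractors the way iunzip_v2 does.
# tuple() runs the outer generator to completion before any inner one starts.
late = tuple((row[i] for row in rows) for i in range(3))

# The inner generators look up i only now, after the outer loop has ended.
late_columns = [list(g) for g in late]

# Consuming each inner generator immediately uses the i of that iteration.
eager = [list(row[i] for row in rows) for i in range(3)]
```

Here late_columns contains the last column three times, while eager contains the three distinct columns.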

    The reason the first version works is that itemgetter(i) is called right away, while the outer generator is still on that value of i. The object it returns stores the index it was given, so later changes to i do not affect it.
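The same early binding can be had with a generator expression by moving it into a helper function, so that i becomes a parameter with its own scope on every call. This is only a sketch, written for Python 3 (next(probe) replaces _tmp.next()); the names _column and iunzip_v3 are hypothetical and not part of the original code.

```python
import itertools

def _column(i, it):
    # i is a parameter here, so each call gets its own binding of i.
    return (elem[i] for elem in it)

def iunzip_v3(iterable):
    """Lazily split an iterable of equal-length tuples into one iterator per column."""
    probe, iterable = itertools.tee(iterable, 2)
    # Peek at the first row to learn the width; empty input raises StopIteration.
    width = len(next(probe))
    iters = itertools.tee(iterable, width)
    return tuple(_column(i, it) for i, it in enumerate(iters))

data = [(0, 0, 0), (1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12)]
result = [list(col) for col in iunzip_v3(data)]
```

With the data from the question, result matches the output of iunzip_v1.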