python, iterator, lazy-evaluation

In Python, can I lazily generate copies of an iterator using tee?


I'm trying to create an iterator which lazily creates (potentially infinitely many) copies of an iterator. Is this possible?

I know I can create any fixed finite number of copies by simply doing

from itertools import tee
iter_copies = tee(my_iter, 10)  # n must be passed positionally; tee() takes no keyword arguments

but this breaks down if you don't know n ahead of time or if n is infinite.

I would usually try something along the lines of

from itertools import tee

def inf_tee(my_iter):
    while True:
        yield tee(my_iter)[1]

But the documentation states that once tee has been applied to an iterator, the original iterator should no longer be used, so this won't work.
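To make the failure concrete, here is a minimal sketch that runs the attempt above against itertools.count(), which is only an assumed stand-in for my_iter. Every call to tee wraps the same underlying iterator, so the "copies" take elements away from each other instead of each seeing the full sequence.

```python
from itertools import count, tee


def inf_tee(my_iter):
    # The attempt from the question: call tee on the original over and over.
    while True:
        yield tee(my_iter)[1]


gen = inf_tee(count())  # count() is a stand-in input for illustration
a = next(gen)
b = next(gen)

# Each "copy" pulls directly from the shared underlying iterator.
a_first = next(a)
b_first = next(b)

print(a_first, b_first)  # b does not start at 0
```

A true copy would also start at 0, which is not what happens for b here.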


In case you're interested in the application: the idea is to create a lazy unzip function, potentially for use in pytoolz. My current implementation can handle a finite number of infinite iterators (which is better than plain zip(*seq)), but not an infinite number of infinite iterators. Here's the pull request if you're interested in the details.


Solution

  • This is only barely touched upon in a single example near the bottom of the Python 2 itertools documentation, but itertools.tee supports copying:

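    import copy
    import itertools

    def infinite_copies(some_iterable):
        # master is never advanced, so every copy of it starts at the beginning
        master, copy1 = itertools.tee(some_iterable)
        yield copy1
        while True:
            # tee objects implement __copy__, so copy.copy yields an independent iterator
            yield copy.copy(master)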

    The example in the documentation actually uses the __copy__ magic method, which is the hook used to customize copy.copy behavior. (Apparently tee.__copy__ was added as part of a copyable iterators project that didn't go anywhere.)

    Note that this requires storing every element ever produced by the original iterator: master is never advanced, and any copy made later must be able to start from the beginning. That can get very expensive, and there is no way to avoid this cost.
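As a quick check of the approach above, the following sketch repeats infinite_copies so that it runs on its own and feeds it itertools.count(), an infinite iterator chosen purely for illustration. It suggests that each copy advances independently and that a copy requested late still starts from the first element.

```python
import copy
import itertools


def infinite_copies(some_iterable):
    # Same generator as in the answer, repeated so this block is self-contained.
    master, copy1 = itertools.tee(some_iterable)
    yield copy1
    while True:
        yield copy.copy(master)


# Assumed demo input: an infinite counter 0, 1, 2, ...
copies = infinite_copies(itertools.count())

first = next(copies)
second = next(copies)

# Advancing one copy does not affect the other.
first_head = list(itertools.islice(first, 3))
second_head = list(itertools.islice(second, 5))

# A copy requested after the others were consumed still starts at 0,
# because master itself is never advanced.
third = next(copies)
third_head = list(itertools.islice(third, 2))

print(first_head, second_head, third_head)
```

The copies are only taken as needed, so the outer generator can be infinite; the price is the growing buffer described above.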