python, loops, iterator

Unexpected behavior when copying iterators using tee


If you copy an iterator inside a for loop, the iteration resumes just fine. For example:

from itertools import tee

ita = iter(range(5))
for a in ita:
    print(a)
    if a == 2:
        ita, itb = tee(ita)

prints 0 1 2 3 4. However, if you iterate over the second copy made, the original iterator depletes as well:

from itertools import tee

ita = iter(range(5))
for a in ita:
    print(a)
    if a == 2:
        ita, itb = tee(ita)
        for b in itb:
            pass

only prints 0 1 2.

As far as I understand it, iterating over the copied iterator shouldn't affect the original one, so I don't know why this is happening. Any help would be appreciated.


Solution

  • tee creates two iterators from one iterator. Each of those two created iterators can be used independently, i.e. consuming one of them does not consume the other one.

    However, both copies draw their values from the original iterator, so consuming a copy also consumes the original. For example:

    from itertools import tee

    a = iter(range(5))
    b, c = tee(a)
    for value in b:
        pass
    

    At this point, b is consumed. By consuming b, a has been consumed as well, but c has not been consumed. See:

    >>> list(a)
    []
    >>> list(b)
    []
    >>> list(c)
    [0, 1, 2, 3, 4]
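
This is also why the original iterator should be left alone once it has been passed to tee. A minimal sketch of what goes wrong otherwise (the names source, left and right are made up for this illustration): advancing source directly hides that value from both copies, because the copies only see what was pulled through one of them.

```python
from itertools import tee

source = iter(range(5))
left, right = tee(source)

first = next(left)        # pulls 0 from source and buffers it for right
skipped = next(source)    # advances source directly; neither copy ever sees 1
rest_left = list(left)    # left continues from wherever source is now
rest_right = list(right)  # right replays the buffer, then follows left

print(first, skipped)     # 0 1
print(rest_left)          # [2, 3, 4]
print(rest_right)         # [0, 2, 3, 4]
```

The value 1 is missing from both copies, which is the reverse of the situation above: there, consuming a copy drained the original; here, touching the original starves the copies.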
    

    Now, in the original code, the same variable name is used for two things:

    ita, itb = tee(ita)
    

    There are three iterators here, but two of them share the variable name ita. That is the cause of the confusion. The new ita iterator does not get consumed together with itb, but the old one does.

    Note that the for a in ita: loop keeps using the original ita, not the new one. That is because the ita variable is read only once, when the for statement starts executing; assigning something else to ita afterwards does not affect the loop.
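
The fact that a for loop looks up its iterable only once can be checked without tee at all. A small sketch, with items, seen and leftover as illustrative names only:

```python
items = iter("abc")
seen = []
for ch in items:
    seen.append(ch)
    items = iter("xyz")  # rebinds the name; the loop keeps its own reference

leftover = list(items)   # the last object bound to the name is untouched

print(seen)      # ['a', 'b', 'c']
print(leftover)  # ['x', 'y', 'z']
```

The loop walks through the first iterator to the end, even though the name items points at a different object after the first pass.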

    You can see that the new ita is not consumed by adding one line to the original code:

    from itertools import tee

    ita = iter(range(5))
    for a in ita:
        print(a)
        if a == 2:
            ita, itb = tee(ita)
            for b in itb:
                pass
            print(list(ita))  # added line
    

    Prints:

    0
    1
    2
    [3, 4]
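
If the goal is for the outer loop to carry on after the copy has been consumed, one possible approach (a sketch, not the only way) is to call next() explicitly, so that the name ita is looked up again on every pass and the rebinding takes effect. The printed list is added here only to collect the output:

```python
from itertools import tee

printed = []
ita = iter(range(5))
while True:
    try:
        a = next(ita)  # looks up the current binding of ita each time
    except StopIteration:
        break
    printed.append(a)
    if a == 2:
        ita, itb = tee(ita)  # the old iterator now belongs to the tee pair
        for b in itb:        # drains itb; the values stay buffered for ita
            pass

print(printed)  # [0, 1, 2, 3, 4]
```

Because the loop now reads from the new ita, the values 3 and 4 that itb pulled out of the old iterator are still delivered from the buffer tee keeps for the other copy.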