
Python + Iterator resulting from map sharing current state with the initial iterator


Let me illustrate this with an example my students and I came across:

>>> a_lot = (i for i in range(10**50))
>>> twice_a_lot = map(lambda x: 2*x, a_lot)
>>> next(a_lot)
0
>>> next(a_lot)
1
>>> next(a_lot)
2
>>> next(twice_a_lot)
6

So somehow these iterators share their current state, as crazy and uncomfortable as it sounds... Any hints as to the model Python uses behind the scenes?


Solution

  • This may be surprising at first but upon a little reflection, it should seem obvious.

    When you create an iterator from another iterator, there is no way to recover the original state of the iteration over whatever underlying container you are iterating over (in this case, the range object). At least not in general.

    Consider the simplest case of this: iter(something).

    When something is an iterator, then according to the iterator protocol specification, iterator.__iter__ must:

    Return the iterator object itself

    In other words, if you've implemented the protocol correctly, then the following identity will always hold:

    iter(iterator) is iterator
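
As a small sketch of what this identity implies for map (the names `numbers`, `same`, and `doubled` are made up for illustration), map ends up pulling items from the very same iterator object it was given, so advancing one advances the other:

```python
numbers = iter(range(5))                  # an iterator over a range
same = iter(numbers)                      # iter() on an iterator returns it unchanged
doubled = map(lambda x: 2 * x, numbers)   # map pulls from `numbers` lazily

is_same = same is numbers
first = next(numbers)     # consumes 0 from the shared iterator
second = next(doubled)    # map asks `numbers` for its next item (1) and doubles it

print(is_same, first, second)  # True 0 2
```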
    

    Of course, map could have some convention that would allow it to recover the starting state and create an independent iterator, but there is no such convention. In general, if you want independent iterators, you need to create each of them from the source.
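
A minimal sketch of creating independent iterators from the source, assuming the source is a re-iterable object such as a range rather than an iterator (the variable names are illustrative only). Each call to iter() on the range produces a fresh iterator, and map makes such a call itself:

```python
source = range(10)  # re-iterable: every iter(source) starts from the beginning

plain = iter(source)
doubled = map(lambda x: 2 * x, source)  # map obtains its own fresh iterator

advanced = [next(plain) for _ in range(3)]  # advance only `plain`
first_doubled = next(doubled)               # `doubled` is unaffected

print(advanced, first_doubled)  # [0, 1, 2] 0
```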

    And of course, there are iterators where this really is not possible without storing all previous results. Consider:

    import random
    
    def random_iterator():
        while True:
            yield random.random()
    

    In that case, how should map behave with the following?

    iterator = random_iterator()
    twice = map(lambda x: x*2, iterator)
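
If independent views of such an iterator are really needed, one option from the standard library is itertools.tee, which does exactly the storing mentioned above: it buffers items that one copy has consumed and the other has not yet seen. This is only a sketch of that approach, not something map does for you, and the names `original` and `copy_for_map` are illustrative:

```python
import itertools
import random

def random_iterator():
    while True:
        yield random.random()

# tee returns two iterators that replay the same underlying values;
# the iterator passed to tee should not be used directly afterwards.
original, copy_for_map = itertools.tee(random_iterator())
twice = map(lambda x: x * 2, copy_for_map)

seen = [next(original) for _ in range(3)]   # advance one copy
doubled = [next(twice) for _ in range(3)]   # the other copy starts from the same first value

print(doubled == [x * 2 for x in seen])  # True
```

The cost is memory: if one copy runs far ahead of the other, tee has to keep every intermediate value.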