
Repeatedly reassigning a variable pointing to an iterable


Consider the following code:

import more_itertools as mo

def rep(x, n):
    for i in range(n):
        yield x

xs = [0]
for n in [1, 2, 3]:
    xs = mo.flatten(rep(x, n) for x in xs)

print(mo.ilen(xs))

The answer should be 6, but it prints 27. Why?

Note that more_itertools.flatten does the obvious thing and is actually an alias for itertools.chain.from_iterable. more_itertools.ilen also does the obvious thing and just counts the elements. I don't think there are any bugs in the functions involved; it seems to be something about reassigning xs.


Solution

  • Your mo.flatten(...) calls receive generator expressions, which are lazily evaluated: nothing runs until the final line, where mo.ilen(xs) consumes the result. Only the outermost iterable of a generator expression (here, xs) is evaluated when the expression is created; the body, rep(x, n), looks up n each time it runs. By then the loop has finished and n has the value 3, so that's the value seen by every generator expression that closes over n. (Note that although you might think n only exists within the loop, it is still in scope after the loop because Python does not have block scope.)

    The result is that each of the three levels of nesting multiplies the length of the original sequence by 3, so the final iterable has a length of 1×3×3×3 = 27 instead of 1×1×2×3 = 6. So this isn't really about xs being reassigned at all; it's about n being reassigned.

    To get the expected behaviour (without eager evaluation), you can wrap the generator expression in a function call in order to close over a different n in each generator expression:

    import more_itertools as mo
    
    def rep(x, n):
        for i in range(n):
            yield x
    
    def make_flatten(xs, n):
        # n is a parameter here, so each call gets its own binding
        # that the loop below never reassigns
        return mo.flatten(rep(x, n) for x in xs)
    
    xs = [0]
    for n in [1, 2, 3]:
        xs = make_flatten(xs, n)
    
    print(mo.ilen(xs)) # prints 6, as expected
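
To reproduce the effect without installing more_itertools, here is a minimal sketch that uses only the standard library. It relies on the question's statement that flatten is an alias for itertools.chain.from_iterable. The helper count is a hypothetical stand-in for mo.ilen, and late_count and bound_count are illustrative names. The second loop shows one other possible way to bind n when each stage is built: a default argument on a lambda passed to map. This is an alternative to the wrapper function above, not a replacement for it.

```python
from itertools import chain

def rep(x, n):
    for _ in range(n):
        yield x

def count(iterable):
    # Hypothetical stand-in for more_itertools.ilen: consume and count.
    return sum(1 for _ in iterable)

# Late binding: n is looked up only when the generators are consumed,
# after the loop has finished and n is 3.
xs = [0]
for n in [1, 2, 3]:
    xs = chain.from_iterable(rep(x, n) for x in xs)
late_count = count(xs)

# Early binding: the default argument n=n captures the current value of n
# when the lambda is created; map itself is still lazy.
xs = [0]
for n in [1, 2, 3]:
    xs = chain.from_iterable(map(lambda x, n=n: rep(x, n), xs))
bound_count = count(xs)

print(late_count, bound_count)  # prints 27 6
```

Both versions stay lazy; the only difference is when n is read.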