
Closure semantics in Python generator comprehensions


In the following code I get ([1], [1]) for l1 but ([0], [1]) for l2, l3 and l4, which I find surprising. In particular, when t is used in the in clause (l3, l4), it makes no difference whether the inner expression is a generator comprehension or a list comprehension, whereas when t is used in the if clause (l1, l2), it does.

l1 = [(x for x in (0, 1) if x == t) for t in (0, 1)]
l2 = [[x for x in (0, 1) if x == t] for t in (0, 1)]
l3 = [(x for x in [t]) for t in (0, 1)]
l4 = [[x for x in [t]] for t in (0, 1)]
print([(*map(list, l),) for l in (l1, l2, l3, l4)])

May I ask you for a detailed explanation of the rules governing such expressions? A link to relevant documentation? A rationale?


Solution

  • When you execute

    l1 = [(x for x in (0, 1) if x == t) for t in (0, 1)]
    

    l1 is a list of generators, each one holding a reference to the same "captured" variable t rather than a copy of its value. When l1[0] is created, the value of t is 0, but the generator body has not run yet. When l1[1] is created, the value of t is 1, and it is not modified afterwards, so both generators see t == 1 by the time they are iterated. You can check this using:

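    c1 = l1[0]
    c2 = l1[1]
    # Inspect the locals of each (not yet started) generator frame;
    # both should report the same current value of t.
    print(c1.gi_frame.f_locals)
    print(c2.gi_frame.f_locals)
    # Identity of the values; in CPython small integers are cached,
    # so this alone does not prove that the variable itself is shared.
    print(c1.gi_frame.f_locals['t'] is c2.gi_frame.f_locals['t'])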
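
The same late binding can be reproduced and avoided with an ordinary function, which may make it easier to see that the generators share one variable rather than holding one value each. This is only an illustrative sketch: `make_gen` is a hypothetical helper that does not appear in the original code, and it works because each call gets its own local t.

```python
# All generators created here close over the same variable t.
late = [(x for x in (0, 1) if x == t) for t in (0, 1)]


def make_gen(t):
    # Hypothetical helper: every call has its own local t,
    # so each generator closes over a different variable.
    return (x for x in (0, 1) if x == t)


bound = [make_gen(t) for t in (0, 1)]

# The generators are only consumed now, after both loops have finished.
late_result = [list(g) for g in late]
bound_result = [list(g) for g in bound]
print(late_result)   # [[1], [1]]
print(bound_result)  # [[0], [1]]
```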
    

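    As for why this does not happen in l3: in a generator expression, the iterable of the first (outermost) for clause is evaluated immediately, at the moment the generator is created, while everything else, including the if condition, is evaluated lazily when the generator is iterated. So [t] is evaluated, and a list created, while t still has the value of the current loop iteration. This is the behavior described for generator expressions in the Python language reference and in the early-binding discussion of PEP 289. A modified script that helps to understand what happens follows (the original code is commented out to facilitate the comparison):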

    def testequal(x, t):
        print(f"Called with x={x}, t={t}")
        return x == t
    
    def generate_limits():
        print("Creating limits")
        return (0, 1)
    
    def generate_list(t):
        print(f"creating list with t={t}")
        rv = [t,]
        return rv
    
    print("Creating l1...")
    # l1 = [(x for x in (0, 1) if x == t) for t in (0, 1)]
    l1 = [(x for x in generate_limits() if testequal(x,t)) for t in (0, 1)]
    print("Creating l2...")
    l2 = [[x for x in (0, 1) if x == t] for t in (0, 1)]
    print("Creating l3...")
    # l3 = [(x for x in [t]) for t in (0, 1)]
    l3 = [(x for x in generate_list(t)) for t in (0, 1)]
    print("Creating l4...")
    l4 = [[x for x in [t]] for t in (0, 1)]
    print("Evaluating...")
    print([(*map(list, l),) for l in (l1, l2, l3, l4)])
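
One rough way to picture the rule is to treat a generator expression as a hidden generator function whose only argument is the iterator over the outermost iterable: that argument is computed immediately, while the body, including any if clause, runs only on iteration. The sketch below hand-writes that approximate equivalent for the l1 and l3 patterns. The names `build_l1`, `build_l3`, `_gen_l1` and `_gen_l3` are made up for illustration, and this is not literally what the compiler emits.

```python
def build_l1():
    result = []
    for t in (0, 1):
        def _gen_l1(it):  # 'it' plays the role of the hidden argument
            for x in it:
                if x == t:  # t is looked up lazily, when the generator runs
                    yield x
        # The outermost iterable is evaluated now, but it does not involve t.
        result.append(_gen_l1(iter((0, 1))))
    return result


def build_l3():
    result = []
    for t in (0, 1):
        def _gen_l3(it):
            for x in it:
                yield x
        # [t] is built now, using the current value of t.
        result.append(_gen_l3(iter([t])))
    return result


l1_like = [list(g) for g in build_l1()]
l3_like = [list(g) for g in build_l3()]
print(l1_like)  # [[1], [1]]
print(l3_like)  # [[0], [1]]
```

In the l1 pattern the only use of t sits in the lazily executed body, so both generators see its final value; in the l3 pattern t is consumed while building the argument, so each generator keeps its own list.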