Calling a function, that yields, twice


Working with python3, I had a requirement:

  • Perform some pre-work
  • Do the core work
  • Cleanup the pre-work

Taking inspiration from fixtures in pytest I came across this post and wrote some crazy code.

Though this crazy code works, I wish to understand the yield sorcery that makes it work :)

def db_connect_n_clean():
  db_connectors = []
  def _inner(db_obj):
    db_connectors.append(db_obj)
    print("Connect : ", db_obj)
  yield _inner
  for conn in db_connectors:
    print("Dispose : ", conn)

This is the driver code:

pre_worker = db_connect_n_clean()
freaky_function = next(pre_worker)
freaky_function("1")
freaky_function("2")
try:
  next(pre_worker)
except StopIteration:
  # The generator is exhausted once the cleanup loop has run
  pass

It produces this output:

Connect :  1
Connect :  2
Dispose :  1
Dispose :  2
Traceback (most recent call last):
  File "junk.py", line 81, in <module>
    next(pre_worker)
StopIteration

What confuses me in this code is that all the calls to the same function, freaky_function, maintain a single db_connectors list.

After the first yield, all the objects are disposed and I hit StopIteration.

I was thinking that calling freaky_function twice would maintain 2 separate lists and there would be 2 separate yields.

Update: The goal of this question is not to understand how to achieve this. As is evident from the comments, a context manager is the way to go. My question is about how this piece of code works, i.e. the Python side of it.
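For reference, here is a minimal sketch of the context-manager form mentioned in the update, assuming contextlib.contextmanager from the standard library. The names db_connections, connect and events are hypothetical and only used for illustration; the events list stands in for the print calls so the ordering is easy to inspect.

```python
from contextlib import contextmanager

events = []  # hypothetical log standing in for the print() calls


@contextmanager
def db_connections():
    connectors = []

    def connect(db_obj):
        # Closure over `connectors`, like _inner in the question
        connectors.append(db_obj)
        events.append(("connect", db_obj))

    try:
        # Pre-work is done; control goes to the body of the `with` block
        yield connect
    finally:
        # Cleanup runs when the `with` block exits, even on an exception
        for conn in connectors:
            events.append(("dispose", conn))


with db_connections() as connect:
    connect("1")
    connect("2")

print(events)
```

Under the hood this is the same generator mechanism as in the question: the with statement advances the generator to the yield on entry and resumes it on exit.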


Solution

  • One of my favorite tools to visualize Python with is PythonTutor.

    There you can see that the first call, next(pre_worker), runs the generator up to the yield and returns the _inner function. Since _inner is defined inside db_connect_n_clean, it can access that function's local variables.

    Internally, in Python, _inner contains a reference to db_connectors. You can see the reference under __closure__:

    >>> gen = db_connect_n_clean()
    >>> inner = next(gen)
    >>> inner.__closure__
    (<cell at 0x000001B73FE6A3E0: list object at 0x000001B73FE87240>,)
    >>> inner.__closure__[0].cell_contents
    []
    

    The name of the reference is the same as the variable:

    >>> inner.__code__.co_freevars
    ('db_connectors',)
    

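    Every time this specific function, with this specific __closure__, accesses db_connectors, it goes to the same list. Calling it twice does not create a second list, because the list was created once, when the generator body started running.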

    >>> inner(1)
    Connect :  1
    >>> inner(2)
    Connect :  2
    >>> inner.__closure__[0].cell_contents
    [1, 2]
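
To get two separate lists, you would need two separate generator objects, each created by its own call to db_connect_n_clean(). A small sketch of that, reusing the function from the question (the names gen_a, gen_b, inner_a and inner_b are only illustrative):

```python
def db_connect_n_clean():
    db_connectors = []

    def _inner(db_obj):
        db_connectors.append(db_obj)
        print("Connect : ", db_obj)

    yield _inner
    for conn in db_connectors:
        print("Dispose : ", conn)


# Each call creates an independent generator with its own frame and locals
gen_a = db_connect_n_clean()
gen_b = db_connect_n_clean()

# Each next() runs one generator body up to its yield, creating one list
inner_a = next(gen_a)
inner_b = next(gen_b)

inner_a("1")
inner_a("2")
inner_b("3")

# Each _inner closes over the list that belongs to its own generator
list_a = inner_a.__closure__[0].cell_contents
list_b = inner_b.__closure__[0].cell_contents
print(list_a)
print(list_b)
print(list_a is list_b)
```

Calling inner_a twice appends to the same list both times, while inner_b never touches it.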
    

    The original generator object gen is still paused at the yield:

    >>> gen.gi_frame.f_lineno
    6  # Gen is stopped at line #6
    >>> gen.gi_frame.f_locals["db_connectors"]
    [1, 2]
    

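    When you advance it again using next(), it continues from the yield, runs the cleanup loop, and then reaches the end of the function. A generator that finishes raises StopIteration, which is why there is no second yield: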

    >>> next(gen)
    Dispose :  1
    Dispose :  2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
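
The pause and resume steps can also be observed directly. This is a small sketch using inspect.getgeneratorstate from the standard library, with next(gen, None) so that the final StopIteration is replaced by a default value; the names states and finished are only illustrative:

```python
import inspect


def db_connect_n_clean():
    db_connectors = []

    def _inner(db_obj):
        db_connectors.append(db_obj)
        print("Connect : ", db_obj)

    yield _inner
    for conn in db_connectors:
        print("Dispose : ", conn)


gen = db_connect_n_clean()
# Nothing in the body has run yet, not even the list creation
states = [inspect.getgeneratorstate(gen)]

inner = next(gen)
# The body ran up to the yield and is now paused there
states.append(inspect.getgeneratorstate(gen))

inner("1")

# Resumes after the yield, runs the cleanup loop, then finishes;
# the default argument makes next() return None instead of raising
finished = next(gen, None)
states.append(inspect.getgeneratorstate(gen))

print(states)
print(finished)
```

Once the generator is closed its frame is released, so gen.gi_frame becomes None and the local db_connectors is only reachable through the closure of inner.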
    

    If you wish to understand how generators work in general, there are plenty of answers and articles on the subject. I wrote this one, for example.

    If I didn't fully explain the situation, feel free to ask for clarification in the comments!