python, generator, contextmanager

How can I make a context manager returning a generator exit successfully when not fully consuming the generator?


I'm reading a CSV file using a with block in Python. The context manager below emulates a file being opened (__enter__), its lines being generated on demand (yield (...)), and the file being closed (__exit__).

import contextlib

@contextlib.contextmanager
def open_file():
    print('__enter__')
    yield (f'line {i}' for i in (1, 2, 3))
    print('__exit__')

I want to read the lines of this CSV and return a model for each row, so I'm doing something like this:

class Model:
    def __init__(self, value: int):
        self.value = value

def create_models():
    with open_file() as f:
        for line in f:
            yield Model(int(line.split(' ')[-1]))

If I run through the whole file, it will call both __enter__ and __exit__ callbacks:

def main1():
    for model in create_models():
        print(model.value)

main1()
# __enter__
# 1
# 2
# 3
# __exit__

However, if I don't need all the models from the file, the __exit__ callback is not called:

def main2():
    for model in create_models():
        print(model.value)
        if model.value % 2 == 0:
            break

main2()
# __enter__
# 1
# 2

How can I implement a context manager that exits even if I don't iterate over the whole child generator? Is it even possible?

Note that, in this example, I can't change the open_file context manager, since in my code it's Python's built-in open (I used open_file here only to have a minimal, reproducible example).

What I want to do is something similar to Streams in Dart:

import 'dart:convert';
import 'dart:io';

class Model {
  final int value;

  const Model(this.value);
}

void main() {
  File('file.csv')
    .openRead()
    .transform(utf8.decoder)
    .transform(const LineSplitter())
    .map((line) => int.parse(line.split(',')[1]))
    .map((value) => Model(value))
    .takeWhile((model) => model.value % 2 == 0)
    .forEach((model) {
      print(model.value);
    });
}

This would close the file successfully, even if I don't consume all lines from it (takeWhile). Since my Python codebase is all synchronous code, I'd like to avoid introducing async and await, because it would take too long to refactor.


Solution

  • To ensure that '__exit__' is printed, change open_file to:

    @contextlib.contextmanager
    def open_file():
        print('__enter__')
        try:
            yield (f'line {i}' for i in (1, 2, 3))
        finally:
            print('__exit__')
    

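    The reason is that break abandons the create_models generator before it is exhausted. When that generator is closed (explicitly, or when it is garbage-collected), Python raises GeneratorExit at its paused yield, the with statement passes the exception to the context manager, and contextlib.contextmanager re-raises it inside open_file at its own yield. Code placed after that yield, here print('__exit__'), is therefore skipped unless it sits in a finally block or the exception is handled.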

    This common pattern (try: yield resource, finally: cleanup) is shown in the first example in the documentation of the contextlib.contextmanager decorator.
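
If the cleanup should not depend on when the abandoned generator happens to be garbage-collected, one option is to close it explicitly on the consumer side. The sketch below is not part of the original answer: it assumes the try/finally version of open_file from above and wraps create_models() in the standard-library helper contextlib.closing, which calls the generator's close() method when the with block ends. The same wrapping should carry over when open_file is the built-in open, since a file object's __exit__ closes the file however the block is left.

```python
import contextlib


@contextlib.contextmanager
def open_file():
    # Stand-in for the built-in open(), as in the question.
    print('__enter__')
    try:
        yield (f'line {i}' for i in (1, 2, 3))
    finally:
        print('__exit__')


class Model:
    def __init__(self, value: int):
        self.value = value


def create_models():
    with open_file() as f:
        for line in f:
            yield Model(int(line.split(' ')[-1]))


def main2():
    # closing() calls models.close() when this with block ends. That raises
    # GeneratorExit inside create_models() at its paused yield, which unwinds
    # the inner with block and runs the cleanup in open_file().
    with contextlib.closing(create_models()) as models:
        for model in models:
            print(model.value)
            if model.value % 2 == 0:
                break
    print('after loop')


main2()
# __enter__
# 1
# 2
# __exit__
# after loop
```

With this shape, '__exit__' is printed before 'after loop', so the resource is released at a known point instead of whenever the interpreter reclaims the generator.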