I'm reading a CSV file using the with-block in Python. This emulates a file being opened (__enter__), the lines of the file being generated on demand (yield (...)) and the file being closed (__exit__).
import contextlib

@contextlib.contextmanager
def open_file():
    print('__enter__')
    yield (f'line {i}' for i in (1, 2, 3))
    print('__exit__')
I want to read the lines of this CSV and return a model for each row, so I'm doing something like this:
class Model:
    def __init__(self, value: int):
        self.value = value

def create_models():
    with open_file() as f:
        for line in f:
            yield Model(int(line.split(' ')[-1]))
If I run through the whole file, it calls both the __enter__ and __exit__ callbacks:
def main1():
    for model in create_models():
        print(model.value)

main1()
# __enter__
# 1
# 2
# 3
# __exit__
However, if I don't need all models from the file, it doesn't call the __exit__ callback:
def main2():
    for model in create_models():
        print(model.value)
        if model.value % 2 == 0:
            break

main2()
# __enter__
# 1
# 2
How can I implement a context manager that exits even if I don't iterate over the whole child generator? Is it even possible?
Note that, in this example, I can't change the open_file context manager, since in my code it's the open built-in from Python (I used open_file here just to have a minimal, reproducible example).
What I want to do is something similar to Streams in Dart:
import 'dart:convert';
import 'dart:io';

class Model {
  final int value;
  const Model(this.value);
}

void main() {
  File('file.csv')
      .openRead()
      .transform(utf8.decoder)
      .transform(const LineSplitter())
      .map((line) => int.parse(line.split(',')[1]))
      .map((value) => Model(value))
      .takeWhile((model) => model.value % 2 == 0)
      .forEach((model) {
    print(model.value);
  });
}
This would close the file successfully, even if I don't consume all lines from it (takeWhile). Since my Python codebase is all synchronous code, I'd like to avoid introducing async and await, because it would take too long to refactor.
To ensure that '__exit__' is printed, change open_file to:
@contextlib.contextmanager
def open_file():
    print('__enter__')
    try:
        yield (f'line {i}' for i in (1, 2, 3))
    finally:
        print('__exit__')
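The reason is that break leaves the create_models generator suspended at its yield. When that generator is closed (CPython does this as soon as the abandoned generator object is garbage-collected), a GeneratorExit exception is raised at the suspended yield and propagates through the with block into open_file. Without a try/finally, that exception makes open_file skip print('__exit__'); a finally clause runs the cleanup regardless.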
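To make the sequence of events visible, here is a minimal sketch that keeps a reference to the generator so it is not garbage-collected right after the break, and then closes it explicitly. The events list is a hypothetical stand-in for the print calls, and plain integers replace the Model class to keep the example short.

```python
import contextlib

events = []  # hypothetical log that replaces the print() calls


@contextlib.contextmanager
def open_file():
    events.append('__enter__')
    try:
        yield (f'line {i}' for i in (1, 2, 3))
    finally:
        # Runs even when GeneratorExit is thrown in at the yield above.
        events.append('__exit__')


def create_models():
    with open_file() as f:
        for line in f:
            yield int(line.split(' ')[-1])


gen = create_models()
for value in gen:
    events.append(value)
    if value % 2 == 0:
        break

# The generator is still suspended here, so no cleanup has happened yet.
events.append('after break')

# close() raises GeneratorExit at the suspended yield in create_models,
# which unwinds the with block and triggers the finally in open_file.
gen.close()

print(events)
```

In main2 from the question no reference to the generator is kept, so CPython closes it as soon as the loop is left; other Python implementations may delay that until a later garbage collection.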
This common pattern of try: yield resource; finally: cleanup is shown in the first example in the documentation of the contextlib.contextmanager decorator.
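If the cleanup should not depend on when the generator is garbage-collected, one option is to wrap the generator in contextlib.closing on the consumer side, which calls the generator's close() method when the block is left. The sketch below assumes the try/finally version of open_file; the names log and first_even are hypothetical, and integers again stand in for Model.

```python
import contextlib

log = []  # hypothetical log that replaces the print() calls


@contextlib.contextmanager
def open_file():
    log.append('__enter__')
    try:
        yield (f'line {i}' for i in (1, 2, 3))
    finally:
        log.append('__exit__')


def create_models():
    with open_file() as f:
        for line in f:
            yield int(line.split(' ')[-1])


def first_even():
    # closing() calls models.close() on exit, including on early return.
    with contextlib.closing(create_models()) as models:
        for value in models:
            if value % 2 == 0:
                return value
    return None


result = first_even()
print(result, log)
```

With the built-in open in place of open_file, the same wrapper should close the file at the end of the with block, since the file object's own __exit__ runs when the generator is closed.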