These three memory or disk buffers follow the same access pattern, so I'm going to focus on BytesIO.
How do I pass in a file or buffer object to be used later? I'm having a lot of trouble with the following use case:
import io
import pandas as pd

def get_file_and_metadata():
    metadata = {"foo": "bar"}
    with io.BytesIO() as f:
        f.write(b'content')
        f.seek(0)
        return f, metadata  # leaving the with block closes f

f, metadata = get_file_and_metadata()
# Do something with file
pd.read_csv(f, encoding="utf-8")  # fails: f is already closed
I suspect this is because f.close() is run after the return statement.
close is run when the with suite terminates, and returning from inside the suite terminates it, so the buffer is already closed by the time the caller receives it. If you want to pass back an open file-like object, you should not open it in a with. One option is to just drop the context manager completely and leave it up to the caller to clean up the object.
import io
import pandas as pd

def get_file_and_metadata():
    metadata = {"foo": "bar"}
    f = io.BytesIO()
    f.write(b'content')
    f.seek(0)
    return f, metadata  # f is still open; the caller must close it

f, metadata = get_file_and_metadata()
try:
    # Do something with file
    pd.read_csv(f, encoding="utf-8")
finally:
    f.close()
This is a reasonable thing to do any time a file object is passed through some sort of pipeline or used in an order that makes context managers inconvenient. Most of the time files are closed when the object is deleted anyway, at least in CPython, but an explicit close() does not depend on that.
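To see the difference between the two versions above without involving pandas, here is a minimal sketch using only the standard library. The function names returns_closed_buffer and returns_open_buffer are made up for illustration and are not from the question.

```python
import io


def returns_closed_buffer():
    # Returning from inside the with block exits the block first,
    # so the buffer is closed before the caller receives it.
    with io.BytesIO() as f:
        f.write(b"content")
        f.seek(0)
        return f


def returns_open_buffer():
    # No context manager here: the caller owns the buffer and must close it.
    f = io.BytesIO()
    f.write(b"content")
    f.seek(0)
    return f


closed_buf = returns_closed_buffer()
print(closed_buf.closed)  # True
try:
    closed_buf.read()
except ValueError as exc:
    # Reading from a closed buffer raises ValueError.
    print("read failed:", exc)

open_buf = returns_open_buffer()
try:
    data = open_buf.read()
    print(data)  # b'content'
finally:
    open_buf.close()
```

Checking the closed attribute like this is a quick way to confirm whether a helper hands back a usable buffer.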
Or you could write your own context manager:
import contextlib
import io
import pandas as pd

@contextlib.contextmanager
def get_file_and_metadata():
    metadata = {"foo": "bar"}
    f = io.BytesIO()
    f.write(b'content')
    f.seek(0)
    try:
        yield f, metadata
    finally:
        f.close()  # runs when the caller's with block exits

# Parentheses are needed to unpack the yielded tuple
with get_file_and_metadata() as (f, metadata):
    # Do something with file
    pd.read_csv(f, encoding="utf-8")
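One thing the generator-based approach buys you is cleanup on errors: the finally clause runs even if the caller's block raises. The following is a small sketch of that behaviour under stated assumptions; open_buffer_with_metadata is a hypothetical name, and the RuntimeError only simulates a failure during processing.

```python
import contextlib
import io


@contextlib.contextmanager
def open_buffer_with_metadata():  # hypothetical helper for illustration
    metadata = {"foo": "bar"}
    f = io.BytesIO(b"content")
    try:
        yield f, metadata
    finally:
        # Runs on normal exit and when the with body raises.
        f.close()


leaked = None
try:
    with open_buffer_with_metadata() as (f, metadata):
        leaked = f  # keep a reference so we can inspect it afterwards
        raise RuntimeError("simulated failure while processing")
except RuntimeError:
    pass

print(leaked.closed)  # True
```

The exception still propagates to the caller; the context manager only guarantees that the buffer is closed first.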
From your comment I realized that the metadata could just go on the BytesIO object as an attribute, and then its own context manager is available.
import io
import pandas as pd

def get_file_and_metadata():
    metadata = {"foo": "bar"}
    f = io.BytesIO()
    f.write(b'content')
    f.seek(0)
    f.metadata = metadata  # attach the metadata as an instance attribute
    return f

with get_file_and_metadata() as f:
    pd.read_csv(f, encoding="utf-8")
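As a quick check of the attribute approach, this standard-library-only sketch shows the metadata travelling with the buffer and the buffer being closed by its own context manager. The name get_buffer_with_metadata and the sample CSV bytes are assumptions made for the example.

```python
import io


def get_buffer_with_metadata():  # hypothetical helper for illustration
    f = io.BytesIO(b"a,b\n1,2\n")
    # A plain instance attribute; it is not part of the io API.
    f.metadata = {"foo": "bar"}
    return f


with get_buffer_with_metadata() as f:
    print(f.metadata["foo"])  # bar
    payload = f.read()

print(f.closed)  # True
```

Keep in mind that the attribute lives on this one object only, so a wrapper around the buffer or a copy of its contents would not carry the metadata along.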