Why is TextIOWrapper closing the given BytesIO stream?


If I run the following code in Python 3:

from io import BytesIO
import csv
from io import TextIOWrapper


def fill_into_stringio(input_io):
    writer = csv.DictWriter(TextIOWrapper(input_io, encoding='utf-8'),fieldnames=['ids'])
    for i in range(100):
        writer.writerow({'ids': str(i)})

with BytesIO() as input_i:
    fill_into_stringio(input_i)
    input_i.seek(0)

I get an error:

ValueError: I/O operation on closed file.

If I don't use the TextIOWrapper, however, the stream stays open. For example, if I change my function to

def fill_into_stringio(input_io):
    for i in range(100):
        input_io.write(b'erwfewfwef')

I don't get any errors any more, so for some reason TextIOWrapper is closing the stream that I would like to read from afterwards. Is this intended behaviour? And if it is, is there a way to achieve what I am trying to do without writing the CSV writer myself?


Solution

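  • The csv module is the odd one out here, because it never takes ownership of the file object you give it. Most file-like objects that wrap other objects, TextIOWrapper included, assume ownership of the wrapped object and close it when they themselves are closed (or cleaned up in some other way). In your code the TextIOWrapper is cleaned up as soon as the function returns, and it takes the BytesIO with it.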
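The ownership rule described above can be seen in isolation with a minimal sketch. The variable names are illustrative only, and the second half relies on CPython's reference counting to clean up the wrapper immediately; other interpreters may delay that cleanup.

```python
from io import BytesIO, TextIOWrapper

# Case 1: closing the wrapper explicitly also closes the wrapped BytesIO.
raw_explicit = BytesIO()
wrapper = TextIOWrapper(raw_explicit, encoding='utf-8')
wrapper.write('abc')
wrapper.close()
closed_after_explicit_close = raw_explicit.closed


# Case 2: the wrapper is only a local variable, as in the question.
def write_and_forget(stream):
    # The wrapper becomes unreachable when this function returns.
    TextIOWrapper(stream, encoding='utf-8').write('abc')


raw_implicit = BytesIO()
write_and_forget(raw_implicit)
# In CPython the wrapper is finalized right away, which closes the stream.
closed_after_cleanup = raw_implicit.closed

print(closed_after_explicit_close, closed_after_cleanup)
```

The second case is the situation from the question: nothing calls close() explicitly, yet the BytesIO ends up closed.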

    One way to avoid the problem is to explicitly detach from the TextIOWrapper before allowing it to be cleaned up:

    def fill_into_stringio(input_io):
        # write_through=True prevents TextIOWrapper from buffering internally;
        # you could replace it with explicit flushes, but you want something 
        # to ensure nothing is left in the TextIOWrapper when you detach
        text_input = TextIOWrapper(input_io, encoding='utf-8', write_through=True)
        try:
            writer = csv.DictWriter(text_input, fieldnames=['ids'])
            for i in range(100):
                writer.writerow({'ids': str(i)})
        finally:
            text_input.detach()  # Detaches input_io so it won't be closed when text_input cleaned up
    

    The only other built-in way to avoid this applies to real file objects: open() accepts a file descriptor together with closefd=False, and the resulting file object won't close the underlying file descriptor when it is closed or otherwise cleaned up.
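As a rough illustration of the closefd=False route, the sketch below assumes a temporary file created with tempfile.mkstemp, used here only to obtain a real file descriptor; it is not part of the original question.

```python
import os
import tempfile

# mkstemp returns an already-open, read/write file descriptor and its path.
fd, path = tempfile.mkstemp()
try:
    # Wrap the descriptor in a text file object without taking ownership of it.
    with open(fd, 'w', encoding='utf-8', newline='', closefd=False) as f:
        f.write('ids\n')
    # The file object is closed now, but the descriptor is still usable.
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)
finally:
    # Because closefd=False was used, closing the descriptor is our job.
    os.close(fd)
    os.remove(path)

print(data)
```

This does not help with BytesIO, which has no file descriptor, so for in-memory streams detach() remains the relevant tool.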

    Of course, in your particular case, there is a simpler way: just make your function expect text-based file-like objects and use them without rewrapping. Your function really shouldn't be responsible for imposing an encoding on the caller's output file (what if the caller wanted UTF-16 output?).

    Then you can do:

    from io import StringIO
    
    def fill_into_stringio(input_io):
        writer = csv.DictWriter(input_io, fieldnames=['ids'])
        for i in range(100):
            writer.writerow({'ids': str(i)})
    
    # newline='' is the Python 3 way to prevent line-ending translation
    # while continuing to operate as text, and it's recommended for any file
    # used with the csv module
    with StringIO(newline='') as input_i:
        fill_into_stringio(input_i)
        input_i.seek(0)
        # If you really need UTF-8 bytes as output, you can make a BytesIO at this point with:
        # BytesIO(input_i.getvalue().encode('utf-8'))
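If the caller does need bytes in a particular encoding, one possible arrangement is for the caller to own the TextIOWrapper and pick the encoding, while the function only ever sees a text stream. This is a sketch under that assumption; the name fill_into_textio and the three-row loop are illustrative and not from the original code.

```python
import csv
from io import BytesIO, TextIOWrapper


def fill_into_textio(text_io):
    # The function only writes text; the caller decides how it is encoded.
    writer = csv.DictWriter(text_io, fieldnames=['ids'])
    for i in range(3):
        writer.writerow({'ids': str(i)})


with BytesIO() as raw:
    # The caller chooses UTF-16 here; newline='' leaves csv's line endings alone.
    text = TextIOWrapper(raw, encoding='utf-16', newline='')
    fill_into_textio(text)
    text.flush()   # push any buffered text into raw
    text.detach()  # release raw so cleaning up the wrapper cannot close it
    raw.seek(0)
    payload = raw.read()

decoded = payload.decode('utf-16')
print(repr(decoded))
```

Because the wrapper is created and detached in the same scope, it is clear who is responsible for the underlying stream.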