
Return file handles opened with with open?


I'm writing software that should accept compressed files. Since files are read and written in many places, I created a utility function that handles opening and closing for me, for some compressed file types.

Example code:

import gzip

def return_file_handle(input_file, open_mode="r"):
    """ Handles compressed and uncompressed files. Accepts open modes r/w/w+ """

    if input_file.endswith(".gz"):
        with gzip.open(input_file, open_mode) as gzipped_file_handle:
            return gzipped_file_handle

The problem is that the file handle seems to be closed by the time the function returns. Is it possible to do what I want with a with open block, or do I need to handle closing myself?

Add this to the code above to get a minimal non-working example:

for line in return_file_handle(input_bed, "rb"):
    print(line)

Create a gzipped textfile with:

printf 'hei\nder!\n' | gzip > test.gz

Error message:

Traceback (most recent call last):
  File "check_bed_against_blacklist.py", line 26, in <module>
    check_bed_against_blacklist("test.gz", "bla")
  File "check_bed_against_blacklist.py", line 15, in check_bed_against_blacklist
    for line in return_file_handle(input_bed, "r"):
ValueError: I/O operation on closed file.

Solution
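
The handle is closed because a with block runs its cleanup whenever control leaves the block, and a return statement counts as leaving it. The sketch below reproduces that behaviour in isolation. The temporary file and its path are assumptions made only so the example runs on its own.

```python
import gzip
import os
import tempfile


def return_file_handle(input_file, open_mode="r"):
    # Same shape as the function in the question: return from inside `with`.
    with gzip.open(input_file, open_mode) as gzipped_file_handle:
        # Leaving the block (even via return) closes the file first.
        return gzipped_file_handle


# Build a small gzipped file to test with (hypothetical path in a temp dir).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test.gz")
with gzip.open(path, "wb") as out:
    out.write(b"hei\nder!\n")

handle = return_file_handle(path)
print(handle.closed)  # the caller only ever sees an already-closed handle
```

Any attempt to read from `handle` at this point raises the ValueError shown in the question.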

  • Try it as a generator:

    def return_file_handle(input_file, open_mode="r"):
        """
        Handles compressed and uncompressed files. Accepts open modes r/w/w+
        """
        # compressed
        if input_file.endswith(".gz"):
            with gzip.open(input_file, open_mode) as gzipped_file_handle:
                yield gzipped_file_handle
        else:
            with open(input_file, open_mode) as normal_fh:
                yield normal_fh
    

    When you call it:

    for fh in return_file_handle("file.gz"):
        print(fh.read())
    

    Or compose a second generator with Python's yield from syntax (Python 3.3+):

    def each_line(handles):
        for fh in handles:
            yield from fh
    

    And calling it:

    for each in each_line(return_file_handle("file.gz")):
        print(each)
    

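    The file is closed cleanly once the for loop has consumed the generator. If you break out of the loop early, the file stays open until the generator is closed or garbage-collected.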
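
Another option, not part of the answer above, is to wrap the same generator in contextlib.contextmanager so that callers can use it in their own with block. This is only a sketch: the name `open_maybe_gzip`, the default text mode "rt", and the temporary test file are assumptions chosen for illustration.

```python
import gzip
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def open_maybe_gzip(input_file, open_mode="rt"):
    """Yield an open handle for a plain or gzipped file, closing it on exit."""
    # Hypothetical helper: pick the opener from the file extension.
    opener = gzip.open if input_file.endswith(".gz") else open
    with opener(input_file, open_mode) as fh:
        # The caller's `with` body runs here, while the file is still open.
        yield fh


# Build a small gzipped file to test with (hypothetical path in a temp dir).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test.gz")
with gzip.open(path, "wt") as out:
    out.write("hei\nder!\n")

with open_maybe_gzip(path) as fh:
    lines = [line.rstrip("\n") for line in fh]

print(lines)      # ['hei', 'der!']
print(fh.closed)  # True, closed as soon as the caller's block ended
```

Compared with the plain generator, the file is closed as soon as the caller's with block exits, including when an exception is raised inside it. A simpler variant is to drop the inner with entirely, return the result of gzip.open or open directly, and let the caller write the with statement, since both kinds of handle are context managers themselves.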