python-3.x io gzip code-readability

How to best handle a file that is either gzipped or not?


Hi, I wonder if there is a better way to write this in terms of code readability and avoiding repetition.

I have a large file that does not fit in memory. The file is either gzip-compressed (.gz) or not.

If it is compressed, I need to open it with gzip from the standard library.

I am not sure the code I ended up with is the best way to deal with this situation.

import gzip
from pathlib import Path

def parse_open_file(openfile):
    """parse the content of the file"""
    return

def parse_file(file_: Path):
    if file_.suffix == ".gz":
        with gzip.open(file_, 'rb') as f:
            parse_open_file(f)
    else:
        with open(file_, 'rb') as f:
            parse_open_file(f)

Solution

  • One way to handle this is to assign either open or gzip.open to a variable, depending on the file's suffix, and then use that variable as an alias in the with statement. Both functions take the same arguments here, so the with block only has to be written once. For example:

    if file_.suffix == ".gz":
      myOpen = gzip.open
    else:
      myOpen = open

    with myOpen(file_, 'rb') as f:
      parse_open_file(f)
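
For completeness, here is a minimal, self-contained sketch of the same idea wrapped in a helper. The names open_maybe_gzip and count_lines are hypothetical and not from the question; count_lines only stands in for parse_open_file and assumes the file can be processed line by line, which keeps memory use low for large files.

```python
import gzip
from pathlib import Path
from typing import IO


def open_maybe_gzip(path: Path) -> IO[bytes]:
    """Open path in binary mode, decompressing on the fly if it ends in .gz.

    Hypothetical helper: it decides by file suffix only, as in the question.
    """
    # gzip.open and open accept the same (path, mode) arguments here,
    # so either one can be called through the same name.
    opener = gzip.open if path.suffix == ".gz" else open
    return opener(path, "rb")


def count_lines(path: Path) -> int:
    """Stand-in for parse_open_file: iterate lazily, one line at a time."""
    with open_maybe_gzip(path) as f:
        # Iterating over the file object never loads the whole file.
        return sum(1 for _ in f)
```

With this, parse_file reduces to a single with block, for example `with open_maybe_gzip(file_) as f: parse_open_file(f)`. Note that the suffix check trusts the file name; a gzipped file without a .gz suffix would be read as raw bytes.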