
How to read multiple .gz files in a particular directory in Python without unzipping them


I have a folder /var/tmp on my Linux machine that contains multiple .gz files named in the format name_yyyymmddhhmmss.gz, for example:

    aakashdeep_20181120080005.gz
    aakashdeep_20181120080025.gz
    kalpana_20181119080005.gz
    aakashdeep_20181120080025.gz

Now I want to open all the .gz files whose names match name_20181120*.gz and read their contents without unzipping them.

I have written a simple script:

    #!/usr/bin/python

    import gzip

    output = gzip.open('/var/tmp/Aakashdeep/aakashdeep_20181120080002.gz', 'r')

    for line in output:
        print(line)

This gives me the expected output for a single file, but I want to open all the matching files with a wildcard, like this:

    output = gzip.open('/var/tmp/Aakashdeep/aakashdeep_20181120*.gz', 'r')

Can anyone suggest a way to do this?


Solution

  • gzip.open takes a single file path and does not expand wildcards itself. Use glob.glob to obtain the list of files matching the pattern, then open each one with gzip.open, do something with its contents, and move on to the next. Outline (untested):

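    import glob
    import gzip

    ZIPFILES = '/var/tmp/Aakashdeep/aakashdeep_20181120*.gz'

    # glob.glob expands the wildcard into a list of matching paths
    filelist = glob.glob(ZIPFILES)
    for gzfile in filelist:
        # print("#Starting " + gzfile)  # uncomment to see which file is being processed
        # Note: in Python 3, mode 'r' yields bytes; use 'rt' to get text (str) lines
        with gzip.open(gzfile, 'r') as f:
            for line in f:
                print(line)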
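
    The same idea can be wrapped in a small generator, which may be handy if the lines need further processing rather than just printing. This is only a sketch and assumes Python 3 and UTF-8 text inside the archives; the function name iter_gz_lines and its encoding parameter are made up for illustration and are not part of the original answer. Each file is decompressed on the fly as it is read, so nothing is unzipped to disk.

```python
import glob
import gzip
import os


def iter_gz_lines(pattern, encoding="utf-8"):
    """Yield (path, line) pairs from every gzip file matching a glob pattern.

    Hypothetical helper for illustration; assumes the files hold text
    in the given encoding.
    """
    # sorted() gives a stable, name-ordered traversal (glob order is arbitrary)
    for path in sorted(glob.glob(pattern)):
        # 'rt' decompresses while reading and decodes bytes to str
        with gzip.open(path, "rt", encoding=encoding) as f:
            for line in f:
                yield path, line.rstrip("\n")


if __name__ == "__main__":
    pattern = "/var/tmp/Aakashdeep/aakashdeep_20181120*.gz"
    for path, line in iter_gz_lines(pattern):
        print(os.path.basename(path) + ": " + line)
```

    Because the timestamp is part of the file name, sorting the matched paths also processes the files of one name in chronological order. If no file matches the pattern, glob.glob returns an empty list and the loop simply does nothing.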