How to read in multiple .csv.gz files at once and append them with pandas


I have a directory (myDir) that contains 3,169 .csv.gz files. For example:

file1.csv.gz
file2.csv.gz
...
file3169.csv.gz

All files have exactly the same layout.

Is it possible to read them all in at once with pandas and collate/append (not merge!!) them?


Solution

  • You can use pathlib instead of glob:

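    import pathlib

    import pandas as pd

    # Directory that holds the .csv.gz files
    myDir = pathlib.Path('my_folder')
    # read_csv infers gzip compression from the .gz extension
    df = pd.concat([pd.read_csv(filename) for filename in myDir.glob('*.csv.gz')],
                   ignore_index=True)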
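
A self-contained way to try this out is sketched below. It is only an illustration: the helper name `read_all_csv_gz`, the temporary directory, and the sample columns `id` and `value` are all made up for the demo and are not part of the question. The sketch also sorts the file list, since `glob` does not promise any particular order, and raises a clearer error when no files match.

```python
import pathlib
import tempfile

import pandas as pd


def read_all_csv_gz(directory):
    """Read every .csv.gz file in `directory` and stack the rows into one DataFrame.

    Hypothetical helper for illustration; the name is not from the original answer.
    """
    # Sort so the row order is reproducible (glob order is not guaranteed)
    files = sorted(pathlib.Path(directory).glob('*.csv.gz'))
    if not files:
        raise FileNotFoundError(f'no .csv.gz files found in {directory}')
    # read_csv infers gzip compression from the .gz suffix
    frames = [pd.read_csv(f) for f in files]
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating each file's index
    return pd.concat(frames, ignore_index=True)


# Demo with three small made-up files written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for i in range(1, 4):
        sample = pd.DataFrame({'id': [i, i], 'value': [i * 10, i * 10 + 1]})
        # to_csv also infers gzip compression from the .gz suffix
        sample.to_csv(pathlib.Path(tmp) / f'file{i}.csv.gz', index=False)
    combined = read_all_csv_gz(tmp)

print(combined)
```

Note that `sorted` orders paths as text, so `file10.csv.gz` would come before `file2.csv.gz`. If the row order across files matters, a numeric sort key would be needed.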