Concatenating multiple dataframes. Issue with datapaths

I want to concatenate several csv files which I saved in a directory ./Errormeasure. In order to do so, I used the following answer from another thread https://stackoverflow.com/a/51118604/9109556

from os import listdir
import pandas as pd

filepaths = [f for f in listdir('./Errormeasure') if f.endswith('.csv')]
df = pd.concat(map(pd.read_csv, filepaths))
print(df)

However, this code only works when the csv files I want to concatenate are in both the ./Errormeasure directory and the directory below, ./venv. This is obviously not convenient. When the csv files are only in ./Errormeasure, I receive the following error:

FileNotFoundError: [Errno 2] File b'errormeasure_871687110001543570.csv' does not exist: b'errormeasure_871687110001543570.csv'

Can you give me some tips to tackle this problem? I am using PyCharm. Thanks in advance!

Solution

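os.listdir() returns only bare file names, not the folder that contains them. pandas.read_csv() needs a path it can resolve, either relative to the current working directory (often the folder where the script resides) or absolute, so a bare name is found only when a copy of the file also sits in the working directory.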
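
The smallest change to the code in the question is to join the directory onto each name before reading it. A minimal sketch, assuming the files sit directly inside one folder; the helper name csv_paths is hypothetical and not part of os or pandas:

```python
import os


def csv_paths(directory):
    """Return paths of the csv files directly inside `directory`.

    `csv_paths` is a hypothetical helper name used for illustration.
    """
    return sorted(
        os.path.join(directory, name)      # prepend the folder to the bare name
        for name in os.listdir(directory)  # listdir yields names only
        if name.endswith('.csv')
    )
```

With that, pd.concat(map(pd.read_csv, csv_paths('./Errormeasure'))) should read the files where they are, with no copies needed in the working directory.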

Instead, consider the recursive option of the built-in glob module (recursive=True, available only in Python 3.5+). It returns the path of every csv file at the top level and in subfolders, each one prefixed with the directory that was searched.

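import glob

dirpath = './Errormeasure'  # folder from the question

# "**" with recursive=True matches the top level and every subfolder
for f in glob.glob(dirpath + "/**/*.csv", recursive=True):
    print(f)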
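
To see the difference between the two calls, the following sketch builds a throwaway folder tree and compares what os.listdir() and glob.glob() return. The names demo.csv and sub/nested.csv are made up for illustration:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as dirpath:
    # Hypothetical layout: one csv at the top level, one in a subfolder.
    os.mkdir(os.path.join(dirpath, 'sub'))
    for rel in ('demo.csv', os.path.join('sub', 'nested.csv')):
        with open(os.path.join(dirpath, rel), 'w') as fh:
            fh.write('a,b\n1,2\n')

    names = sorted(os.listdir(dirpath))
    paths = sorted(glob.glob(dirpath + "/**/*.csv", recursive=True))

    # listdir gives bare names (including the subfolder itself) ...
    print(names)
    # ... while glob gives paths that start with dirpath, nested files included.
    print(paths)
```

Only the glob results can be handed to pd.read_csv() unchanged, because each one already carries its folder.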

From there, build the data frames in a list comprehension (rather than map; see "List comprehension vs map") and concatenate them with pd.concat:

df_files = [pd.read_csv(f) for f in glob.glob(dirpath + "/**/*.csv", recursive=True)]
df = pd.concat(df_files)
print(df)

For Python < 3.5, consider os.walk() + os.listdir() to retrieve full paths of csv files:

import os
import pandas as pd

dirpath = './Errormeasure'  # folder from the question

# COMBINE CSVs IN CURR FOLDER + SUB FOLDERS
fpaths = [os.path.join(dirpath, f) 
            for f in os.listdir(dirpath) if f.endswith('.csv')] + \
         [os.path.join(fdir, fld, f) 
            for fdir, flds, ffile in os.walk(dirpath) 
            for fld in flds  
            for f in os.listdir(os.path.join(fdir, fld)) if f.endswith('.csv')]

df = pd.concat([pd.read_csv(f) for f in fpaths])
print(df)
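
Since os.walk() already yields the file names of every folder it visits, starting with the top-level one, the same list of paths can probably be built without the extra os.listdir() calls. A sketch under that assumption; walk_csv_paths is a hypothetical helper name:

```python
import os


def walk_csv_paths(dirpath):
    """Collect csv paths under `dirpath`, top level and all subfolders.

    `walk_csv_paths` is a hypothetical name chosen for this example.
    """
    return [
        os.path.join(root, name)
        for root, _dirs, files in os.walk(dirpath)  # root is the folder being visited
        for name in files                           # files are bare names within root
        if name.endswith('.csv')
    ]
```

The result can be passed to pandas in the same way: pd.concat([pd.read_csv(f) for f in walk_csv_paths(dirpath)]).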