python pandas pathlib

Open a file with a specific extension, regardless of the name


I have the following code:

folder_names = []
spreadsheet_contents = []
all_data = pd.DataFrame()
current_directory = Path.cwd()


for folder in current_directory.iterdir():
    
    folder_names.append(folder.name)
    
    file_n = '*.csv'
         
    spreadsheet_path = folder / file_n
    spreadsheet_contents.append(pd.read_excel(spreadsheet_path, skiprows = 1, header = None, usecols = [5]))
    

The problem is that the .csv files in each folder are named differently, and joining '*.csv' onto the folder path does not work. Does anyone know how to open the .csv file in each subfolder even though they are all named differently?


Solution

  • It looks like you are already using pathlib for the paths. pathlib supports recursive globbing with the ** pattern, which matches through all subdirectories (this can be quite slow on large directory trees):

    files = Path('.').glob('**/*.csv')
    

    To read the files, you can do something like the following, passing whatever arguments suit your file structure:

    pd.concat(pd.read_csv(f) for f in files)
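If you want to keep the per-folder loop from the question rather than globbing recursively, a minimal sketch could look like the one below. It assumes each subfolder holds at most one .csv file of interest and takes the first match in sorted order. The helper name find_csv_per_folder is hypothetical and not part of pathlib or pandas.

```python
from pathlib import Path


def find_csv_per_folder(root="."):
    """Map each immediate subfolder name to the first .csv file inside it.

    Hypothetical helper for illustration; assumes one relevant .csv per folder.
    """
    found = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # skip loose files sitting next to the subfolders
        # glob() expands the wildcard; folder / '*.csv' is just a literal name
        csv_files = sorted(folder.glob("*.csv"))
        if csv_files:
            found[folder.name] = csv_files[0]
    return found
```

Each returned path can then be passed to pd.read_csv with the arguments from the question (skiprows=1, header=None, usecols=[5]). The question's code calls pd.read_excel, which does not read .csv files, so pd.read_csv is probably what you want here.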