python pandas glob

Pandas: name a dataframe from a string in the CSV file name


I have several CSV files with a string in their names (e.g. a city name) and want to read them into dataframes whose names are derived from that city name.

Example CSV names: data_paris.csv, data_berlin.csv

How can I read them in a loop to get df_paris and df_berlin?

What I tried so far:

all_files = glob.glob("./*.csv")

for filename in all_files:
    city_name=re.split("[_.]", filename)[1] #to extract city name from filename
    dfname= {'df' + str(city_name)}
    print(dfname)
    dfname= pd.read_csv(filename)

I expect to get df_paris and df_berlin, but I just get dfname. Why?

A related question: Name a dataframe based on csv file name?

Thank you!


Solution

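  • The loop only ever rebinds the single variable dfname: assigning to dfname replaces whatever it held before and never creates a variable named after that string. I would recommend against automatic dynamic naming like df_paris, df_berlin. Instead, keep the dataframes in a dictionary keyed by city name: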

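    import glob
    import re
    from pathlib import Path

    import pandas as pd

    all_files = glob.glob("./*.csv")

    # dictionary of dataframes, keyed by city name
    dfs = {}
    for filename in all_files:
        # glob returns "./data_paris.csv"; split only the bare file name,
        # otherwise the leading "./" shifts the pieces and [1] is not the city
        city_name = re.split("[_.]", Path(filename).name)[1]

        dfs[city_name] = pd.read_csv(filename)  # assign to the dataframe dictionary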
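
Once the dictionary is built, each dataframe is looked up by its city name rather than by a variable name. Here is a minimal sketch of that usage, assuming files named data_<city>.csv as in the question. The temporary directory, the temp column, and the sample values are made up for illustration, and Path(...).stem.split(...) is just one possible way to pull out the city name.

```python
import glob
import os
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data: write two small CSV files into a temporary directory.
tmp_dir = tempfile.mkdtemp()
for city, temps in {"paris": [12, 14], "berlin": [9, 11]}.items():
    path = os.path.join(tmp_dir, f"data_{city}.csv")
    pd.DataFrame({"temp": temps}).to_csv(path, index=False)

# Build the dictionary of dataframes, keyed by the part after "data_".
dfs = {}
for filename in glob.glob(os.path.join(tmp_dir, "*.csv")):
    city_name = Path(filename).stem.split("_", 1)[1]  # "data_paris" -> "paris"
    dfs[city_name] = pd.read_csv(filename)

print(sorted(dfs))         # ['berlin', 'paris']
print(dfs["paris"].shape)  # (2, 1)

# Loop over every city without knowing the names in advance.
for city_name, df in dfs.items():
    print(city_name, df["temp"].mean())
```

So dfs["paris"] plays the role that df_paris would have played, and adding a new CSV file to the folder needs no change to the code.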