python, pandas, csv, for-loop, file-rename

Python pandas: dynamically naming loaded CSV files


I'm reading in a selection of consistently named CSV files from the same directory. I'm looking to load them such that the variable names change based on the file name, i.e.:

import pandas as pd

food_list=['apples','oranges','pears']
place_list=['bodega','grocery']

for i in range(0, len(food_list)):
    for j in range(0,len(place_list)):
        train=pd.read_csv(str(food_list[i])+'_'+str(place_list[j])+'_train.csv')
        new_name=str(food_list[i])+'_'+str(place_list[j])+'_train'
        train=new_name
        test=pd.read_csv(str(food_list[i])+'_'+str(place_list[j])+'_test.csv')
        new_name=str(food_list[i])+'_'+str(place_list[j])+'_test'
        test=new_name

### Desired output:
apples_bodega_train # is a dataframe
apples_bodega_test # is a dataframe
...
pears_grocery_train # is a dataframe
pears_grocery_test # is a dataframe

### Actual output:
train # pears_grocery_train
test # pears_grocery_test

So, I'm clearly just overwriting the loaded dataframes "train" and "test" with useless strings on each loop iteration, rather than renaming the dataframes themselves. Could someone enlighten me regarding the intelligent way to go about this?


Solution

  • That would actually be possible by writing into globals() or by using exec(), but that's definitely not what you want to do. How about saving the dataframes in a dictionary, keyed by the name you wanted each variable to have? Like this:

    dataframes = dict()  # create once, before the loops
    for food in food_list:
        for place in place_list:
            for split in ('train', 'test'):
                name = food + '_' + place + '_' + split
                dataframes[name] = pd.read_csv(name + '.csv')
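
For anyone who wants to try the dictionary approach end to end, here is a minimal sketch. The helper name load_frames, the temporary directory, and the made-up "price" column are assumptions for illustration only and are not part of the original question; the file naming follows the <food>_<place>_<split>.csv pattern from the question.

```python
import itertools
import tempfile
from pathlib import Path

import pandas as pd


def load_frames(directory, foods, places, splits=("train", "test")):
    """Hypothetical helper: map '<food>_<place>_<split>' to the DataFrame read from that CSV."""
    frames = {}
    for food, place, split in itertools.product(foods, places, splits):
        name = f"{food}_{place}_{split}"
        frames[name] = pd.read_csv(Path(directory) / f"{name}.csv")
    return frames


food_list = ["apples", "oranges", "pears"]
place_list = ["bodega", "grocery"]

# Write throwaway CSVs (made-up data) so the sketch runs without the real files.
with tempfile.TemporaryDirectory() as tmp:
    for food, place, split in itertools.product(food_list, place_list, ("train", "test")):
        pd.DataFrame({"price": [1.0, 2.0]}).to_csv(
            Path(tmp) / f"{food}_{place}_{split}.csv", index=False
        )
    dataframes = load_frames(tmp, food_list, place_list)

# Each dataframe is looked up by the name it would have had as a variable.
print(sorted(dataframes))
print(dataframes["apples_bodega_train"].head())
```

With your real files you would pass the directory that holds them instead of the temporary one, and then use dataframes['apples_bodega_train'] wherever you would have used a variable called apples_bodega_train. Looping over dataframes.items() also gives you every name and dataframe in one pass, which separate variables would not.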