python, pandas, glob

Name a dataframe based on csv file name?


I'm trying to batch-analyze a folder full of .csv files, then save each one out again under a name based on the original .csv file name. However, I'm having trouble extracting just the file name and assigning it to the dataframe (df).

import glob
import pandas as pd

path = r'csv_in'
allFiles = glob.glob(path + '/*.csv')

for file_ in allFiles:   
    df = pd.read_csv(file_, header=0)
    df.name = file_
    print(df.name)

The output I get is the full path, "csv_in/*.csv".

The result I'm looking for is just the csv file name, "*.csv".


Solution

  • Create a new column with [] and fill it with the file name, obtained by applying os.path.basename to the path after os.path.normpath:

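    import os
    
    for file_ in allFiles:
        df = pd.read_csv(file_, header=0)
        # store the bare file name (directory stripped) in a new column
        df['name'] = os.path.basename(os.path.normpath(file_))
        # if you also need to remove the extension (.csv):
        # df['name'] = os.path.splitext(os.path.basename(file_))[0]
        # prints the new column: the file name repeated on every row
        print(df['name'])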
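
If the goal is to look each dataframe up by its file name and save it out again, another option is to keep the name outside the dataframe. An attribute set with df.name = ... is generally not carried over to new dataframes produced by later operations, so a dict keyed by file name can be more robust. The following is only a minimal sketch of that idea: the helper names load_frames_by_name and save_frames are made up for illustration and do not come from the original answer.

```python
import glob
import os

import pandas as pd


def load_frames_by_name(folder):
    """Read every .csv in `folder` into a dict keyed by file name without extension.

    Hypothetical helper, not part of pandas.
    """
    frames = {}
    for path in sorted(glob.glob(os.path.join(folder, '*.csv'))):
        # e.g. 'csv_in/sales.csv' -> 'sales.csv' -> 'sales'
        stem = os.path.splitext(os.path.basename(path))[0]
        frames[stem] = pd.read_csv(path, header=0)
    return frames


def save_frames(frames, out_folder):
    """Write each dataframe to out_folder/<name>.csv, reusing the dict key as the name.

    Hypothetical helper, not part of pandas.
    """
    os.makedirs(out_folder, exist_ok=True)
    for name, df in frames.items():
        df.to_csv(os.path.join(out_folder, name + '.csv'), index=False)
```

With the folder from the question, usage would be frames = load_frames_by_name('csv_in'), then any analysis on each frames[name], then save_frames(frames, 'csv_out'), where 'csv_out' is an assumed output folder name.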