
Create folders dynamically and write CSV files to those folders


I would like to read several input files from a folder, perform some transformations, create folders on the fly, and write the CSV files to the corresponding folders. My input paths look like this:

"Input files\P1_set1\Set1_Folder_1_File_1_Hour09.csv" - for a single patient (This file contains readings of patient (P1) at 9th hour)

Similarly, there are multiple files for each patient, and each patient's files are grouped under that patient's own folder, as shown below:

[Screenshot: input folder structure, one folder per patient]

So, to read each file, I am using a glob wildcard pattern (not a regex), as shown in the code below.

I have already tried the glob package and can read the files successfully, but I am running into problems when creating the output folders and saving the files. I am parsing the file path string as shown below:

f = "Input files\P1_set1\Set1_Folder_1_File_1_Hour09.csv"

f[12:] = "P1_set1\Set1_Folder_1_File_1_Hour09.csv"

import glob
import pandas as pd

filenames = sorted(glob.glob(r'Input files\P*_set1\*.csv'))
for f in filenames:
    print(f)       # This prints the full path
    print(f[12:])  # This prints the folder structure along with the filename
    df_transform = pd.read_csv(f)
    df_transform = df_transform.drop(['Format 10','Time','Hour'],axis=1)
    df_transform.to_csv("Output\\" + str(f[12:]),index=False)

I expect the output folder to contain the CSV files grouped by patient under their respective folders. The transformed files should be arranged in the output folder with the same structure as the input folder. Please note that the "Output" folder already exists (creating a single folder is easy).

[Screenshot: expected output folder structure]


Solution

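  • To read the files in a folder you can use the os library. to_csv does not create missing folders, so create each patient folder under Output with os.makedirs before writing:

    import os
    import pandas as pd

    input_root = "Input files"
    output_root = "Output"
    for folder in sorted(os.listdir(input_root)):
        in_dir = os.path.join(input_root, folder)
        if not os.path.isdir(in_dir):
            continue
        out_dir = os.path.join(output_root, folder)
        # Create the patient folder under Output if it does not exist yet
        os.makedirs(out_dir, exist_ok=True)
        for x in sorted(os.listdir(in_dir)):
            if not x.endswith(".csv"):
                continue
            df_transform = pd.read_csv(os.path.join(in_dir, x))
            df_transform = df_transform.drop(['Format 10','Time','Hour'],axis=1)
            df_transform.to_csv(os.path.join(out_dir, x),index=False)


    If you keep your glob loop, then instead of slicing with f[12:], take the file name from the path f like

    file_name = os.path.basename(f) #if you want filename.csv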
    

    Let me know if this is what you wanted
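
    In case it helps, here is a minimal sketch of the same idea with pathlib: compute each file's path relative to the input folder instead of slicing with f[12:], then create the missing parent folders before writing. The helper name mirrored_output_path and the sample file contents are made up for illustration, and the plain-text copy only stands in for the pandas read/drop/to_csv steps from the question.

```python
import tempfile
from pathlib import Path


def mirrored_output_path(input_file, input_root, output_root):
    """Return the path under output_root that mirrors input_file's position
    under input_root, creating any missing parent folders on the way."""
    relative = Path(input_file).relative_to(input_root)  # e.g. P1_set1/<file>.csv
    target = Path(output_root) / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


# Demonstration in a throwaway directory; the sample data is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    input_root = Path(tmp) / "Input files"
    output_root = Path(tmp) / "Output"
    sample = input_root / "P1_set1" / "Set1_Folder_1_File_1_Hour09.csv"
    sample.parent.mkdir(parents=True)
    sample.write_text("Time,Hour,Value\n09:00,9,1.5\n")

    created = []
    for f in sorted(input_root.glob("P*_set1/*.csv")):
        out = mirrored_output_path(f, input_root, output_root)
        # Stand-in for: df_transform.to_csv(out, index=False)
        out.write_text(f.read_text())
        created.append(out.relative_to(output_root).as_posix())

print(created)
```

    With pandas, the write inside the loop would become df_transform.to_csv(out, index=False). Working with relative paths avoids hard-coding the length of "Input files\" and behaves the same with either path separator.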