python pandas glob pathlib

How to read in a file and output the file with the same file name into another subfolder using pathlib.Path().glob()


How do I read in files and output the files with the same file names into another subfolder using pathlib.Path().glob()?

My directory looks like this:

New Folder 1

-> p1_a.csv

-> p1_b.csv

-> New Folder 2

Code:

from pathlib import Path
import pandas as pd

file_path = r'C:\Users\HP\Desktop\New Folder 1'

for fle in Path(file_path).glob('p1_*.csv'):
   df = pd.read_csv(fle)

   # do something with df

   df.to_excel(file_path + r'\New Folder 2' + 'p1_*.csv' + '_new.csv')

The bit of the code which I am not sure about is 'p1_*.csv'.

After the code is run, my directory should look like this:

New Folder 1

-> p1_a.csv

-> p1_b.csv

-> New Folder 2

-> -> p1_a.csv_new.csv

-> -> p1_b.csv_new.csv

What do I need to put in place of the 'p1_*.csv' bit so that the new files written to New Folder 2 keep part of the original file names?

Many thanks in advance.


Solution

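  • 'p1_*.csv' is only the search pattern, so it cannot be used as an output name. Build the output path from the file name of each matched path instead (os.path.basename returns it). Try this: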

    import os
    import pandas as pd
    from pathlib import Path
    
    source_dir = r'C:/Users/HP/Desktop/New Folder 1'
    
    for path in Path(source_dir).glob('p1_*.csv'):
        df = pd.read_csv(path)
        # TODO: do something with df
    
        # glob() yields the full path of each match; keep only the file name, e.g. 'p1_a.csv'
        filename = os.path.basename(path)
        dest_path = os.path.join(source_dir, "New Folder 2", f"{filename}_new.csv")
    
        # index=False keeps pandas from writing the row index as an extra column
        df.to_csv(dest_path, index=False)
    

    Now your directory structure will look like:

    New Folder 1
    ├── New Folder 2
    │   ├── p1_a.csv_new.csv
    │   └── p1_b.csv_new.csv
    ├── p1_a.csv
    └── p1_b.csv
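
The same idea can also be written with pathlib alone, since every path returned by glob() already exposes its file name as `.name`. The following is only a sketch under a few assumptions: the helper names `new_name_in_subfolder` and `convert_all` are made up for illustration, the output folder is created if it is missing, and `index=False` is passed so the row index is not written.

```python
from pathlib import Path


def new_name_in_subfolder(src, subfolder="New Folder 2", suffix="_new.csv"):
    """Build the output path for one source file (hypothetical helper).

    The output lives in `subfolder` next to the source file and reuses the
    source file name with `suffix` appended.
    """
    src = Path(src)
    # src.name is the final path component, e.g. 'p1_a.csv'
    return src.parent / subfolder / f"{src.name}{suffix}"


def convert_all(source_dir, pattern="p1_*.csv"):
    """Read every matching CSV and write it into the subfolder (hypothetical helper)."""
    # Imported here so the path helper above needs only the standard library
    import pandas as pd

    written = []
    # sorted() gives a predictable processing order; glob() does not promise one
    for src in sorted(Path(source_dir).glob(pattern)):
        df = pd.read_csv(src)
        # TODO: do something with df

        dest = new_name_in_subfolder(src)
        # Create the output folder if it does not exist yet
        dest.parent.mkdir(parents=True, exist_ok=True)
        df.to_csv(dest, index=False)
        written.append(dest)
    return written
```

Calling `convert_all(r'C:/Users/HP/Desktop/New Folder 1')` should then write `p1_a.csv_new.csv` and `p1_b.csv_new.csv` into `New Folder 2`. If a name such as `p1_a_new.csv` is preferred, `src.stem` (the file name without its extension) could be used in place of `src.name`.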