Problem
I have multiple directories, each with subdirectories. These subdirectories contain .csv files with numerical data. I want to use glob and os (not shell scripts) to search two specified directories, locate specific files by name, and concatenate them in the format described below.
dir1 contains subdir1 contains A.csv
     contains subdir2 contains B.csv
dir2 contains subdir1 contains A.csv
     contains subdir2 contains B.csv
IN BOTH CASES
>>> cat A.csv
1
2
3
4
5
>>> cat B.csv
6
7
8
9
10
MY DESIRED BEHAVIOUR
Find A.csv in dir1 and A.csv in dir2, searching every subdirectory of each, and then merge them. After the merge, create a pandas.DataFrame.
>>> python3 merge.py dir1 dir2 A.csv
# prints df created from out.csv
   x  y
0  1  1
1  2  2
2  3  3
3  4  4
4  5  5
>>> cat out.csv
1
2
3
4
5
1
2
3
4
5
ASK QUESTIONS IF NEEDED
You can use os.walk to walk through directories and glob.glob to search for *.csv files, like so:
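from os import walk
from os.path import join
from glob import glob

root_dir = '/some/path/to_a_directory/'
# walk() yields root_dir itself and then every directory beneath it
for current_dir, _, _ in walk(root_dir):
    # Glob inside the directory currently being visited, not root_dir
    all_csv = glob(join(current_dir, '*.csv'))
    for fpath in all_csv:
        # Open the file and do something with it
        with open(fpath) as f:
            print(fpath, f.read())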
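Building on that loop, here is a minimal sketch of what merge.py from the question might look like. The helper names find_files, concatenate and build_frame are made up for illustration, the column labels x and y are taken from the desired output in the question, and each CSV is assumed to hold a single headerless column of numbers.

```python
import sys
from glob import glob
from os import walk
from os.path import join


def find_files(root_dirs, filename):
    """Return every file called `filename` under each directory in root_dirs."""
    matches = []
    for root_dir in root_dirs:
        found = []
        for current_dir, _, _ in walk(root_dir):
            # Glob inside each visited directory for the requested name
            found.extend(glob(join(current_dir, filename)))
        matches.extend(sorted(found))
    return matches


def concatenate(paths, out_path):
    """Write the contents of all paths, one after another, to out_path."""
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as src:
                text = src.read()
            # Guard against a missing final newline so rows do not fuse
            if text and not text.endswith("\n"):
                text += "\n"
            out.write(text)


def build_frame(paths, names=("x", "y")):
    """Put each file side by side as one column (labels are an assumption)."""
    import pandas as pd  # imported here so the helpers above work without pandas

    if len(paths) != len(names):
        raise ValueError(f"expected {len(names)} files, found {len(paths)}")
    return pd.DataFrame(
        {name: pd.read_csv(path, header=None)[0] for name, path in zip(names, paths)}
    )


def main(argv):
    # Last argument is the file name; everything before it is a directory
    *root_dirs, filename = argv
    paths = find_files(root_dirs, filename)
    concatenate(paths, "out.csv")
    print(build_frame(paths))


if __name__ == "__main__" and len(sys.argv) >= 3:
    main(sys.argv[1:])
```

Invoked as python3 merge.py dir1 dir2 A.csv, this should write the two A.csv files stacked into out.csv and print a two-column DataFrame. Note that the question shows out.csv stacked vertically but the DataFrame side by side, so the sketch builds the DataFrame from the source files rather than from out.csv; adjust build_frame if you want a different shape.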