I have a directory to which .csv files are regularly added (1 or 2 every 30 min).
My pandas script merges and cleans the two latest .csv files in that directory (the two paths are currently entered manually) and then saves a .csv of their differences to a different directory.
To remove this manual step, I would like to obtain the paths of the two most recent .csv files and assign them to the left df and right df for the initial merge.
It would be preferable to sort the directory by date created and then use an index to assign the most recent files (in this case [0] and [1]).
I have tried modifying the snippet below, but it only yields the latest .csv:
from pathlib import Path
left_path = '/home/user/some_folder/csv1'
files = Path(left_path).glob('*.csv')
latest_left = max(files, key=lambda f: f.stat().st_mtime)
right_path = '/home/user/some_folder/csv2'
files = Path(right_path).glob('*.csv')
latest_right = max(files, key=lambda f: f.stat().st_mtime)
Thanks for the help!
You were almost there!
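If you make a list of the files in your directory and sort it by modification time, the two most recent files are the last two entries in the list. Note that st_mtime is the last-modification time rather than the creation time; for files that are written once and never edited, the two orderings should match.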
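# path is the directory that holds the .csv files
files = list(Path(path).glob('*.csv'))
# sort ascending by modification time: oldest first, newest last
files.sort(key=lambda f: f.stat().st_mtime)
csv1 = files[-1]  # most recent
csv2 = files[-2]  # second most recent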
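
If you would rather index from the front with [0] and [1], as in the question, the same idea can be wrapped in a small helper that sorts newest-first. This is only a sketch: the function name two_latest_csvs is made up for illustration, and raising an error when fewer than two files are present is an assumption about how you want that case handled.

```python
from pathlib import Path


def two_latest_csvs(directory):
    """Return (newest, second_newest) .csv paths in `directory`.

    Hypothetical helper: files are ordered by st_mtime (last modification),
    which is not necessarily the creation time.
    """
    files = sorted(
        Path(directory).glob("*.csv"),
        key=lambda f: f.stat().st_mtime,
        reverse=True,  # newest first, so [0] and [1] are the two most recent
    )
    if len(files) < 2:
        # Assumed behaviour: fail loudly rather than merge a file with itself
        raise ValueError(
            f"expected at least two .csv files in {directory}, found {len(files)}"
        )
    return files[0], files[1]
```

You could then call it as left_path, right_path = two_latest_csvs(path) and pass each result to pd.read_csv before the merge.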