I am writing a Python script to navigate to a folder on my desktop, read files (using glob patterns, as I will be adding files every day or so), and copy their contents into one separate .txt file.
I wrote the script below:
#!/usr/bin/env python3
import glob

# Concatenate every matching diary file into a single output file
with open('../python_diary.txt', 'w') as outfile:
    for filename in glob.glob('../Desktop/diary/*-2020.txt'):
        with open(filename) as infile:
            for line in infile:
                outfile.write(line)
The script generally works fine, but my files are named in dd-mm-yyyy format, and when I run the script their contents appear in the destination file in the following order (up to today): 19-06-2020, 17-06-2020, 16-06-2020, 18-06-2020.
Any idea how I can make these concatenated files appear from oldest to newest?
Thanks,
You can sort the glob results with a key function that extracts the date from each filename. Assuming every filename ends in a date with a zero-padded day and month and a 4-digit year, this will work for you:
import glob
import os

# Grab the filenames matching this glob
filenames = glob.glob('../Desktop/diary/*-2020.txt')

# Sort the filenames by ascending date
def filename_to_isodate(filename):
    # Strip the directory and extension, then keep the trailing dd-mm-yyyy
    date = os.path.basename(filename).rsplit('.', 1)[0][-10:]
    # Reorder to yyyymmdd so that string order matches date order
    return date[-4:] + date[3:5] + date[:2]

filenames = sorted(filenames, key=filename_to_isodate)
for filename in filenames:
    ...  # Your stuff here...
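Combined with the loop from the question, a minimal sketch could look like the following. The `concatenate_diary` helper and its parameter names are made up here for illustration, and the sketch assumes every matched filename ends in a zero-padded dd-mm-yyyy date.

```python
import glob
import os


def filename_to_isodate(filename):
    """Turn '.../dd-mm-yyyy.txt' into the sortable string 'yyyymmdd'."""
    date = os.path.basename(filename).rsplit('.', 1)[0][-10:]
    return date[-4:] + date[3:5] + date[:2]


def concatenate_diary(pattern, destination):
    """Hypothetical helper: copy all files matching pattern into destination, oldest first."""
    filenames = sorted(glob.glob(pattern), key=filename_to_isodate)
    with open(destination, 'w') as outfile:
        for filename in filenames:
            with open(filename) as infile:
                for line in infile:
                    outfile.write(line)
    # Return the order that was used, which is handy for checking the result
    return filenames
```

Calling `concatenate_diary('../Desktop/diary/*-2020.txt', '../python_diary.txt')` would then reproduce the setup from the question with the entries in date order.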
Explanation
os.path.basename gives us the name of the file, e.g., '../Desktop/diary/01-01-2020.txt' becomes '01-01-2020.txt'.
rsplit('.', 1)[0][-10:] splits the basename at the last period, which strips the extension and keeps only what comes before it. The [-10:] then grabs only the 10 characters that make up the date: 4 for the year + 2 for the month + 2 for the day + 2 dashes = 10 characters.
The return statement rearranges those characters into yyyymmdd, so comparing the strings gives the same result as comparing the dates.
Last, in the sorting, we use sorted with the key argument to tell the function to sort by ISO date (year, month, day).
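To make the effect of the key concrete, here is a small worked example. The first four names are the dates from the question; the last two are made-up entries from other months, added only to show that sorting the raw dd-mm-yyyy names would go wrong once a month changes.

```python
import os


def filename_to_isodate(filename):
    date = os.path.basename(filename).rsplit('.', 1)[0][-10:]
    return date[-4:] + date[3:5] + date[:2]


names = [
    '../Desktop/diary/19-06-2020.txt',
    '../Desktop/diary/17-06-2020.txt',
    '../Desktop/diary/16-06-2020.txt',
    '../Desktop/diary/18-06-2020.txt',
    '../Desktop/diary/01-07-2020.txt',  # hypothetical later entry
    '../Desktop/diary/30-05-2020.txt',  # hypothetical earlier entry
]

# The key for each name, in yyyymmdd form:
# ['20200619', '20200617', '20200616', '20200618', '20200701', '20200530']
keys = [filename_to_isodate(n) for n in names]

# Plain sorting compares the day first, so 01-07 lands before 16-06
by_name = sorted(names)

# Sorting with the key compares year, then month, then day
by_date = sorted(names, key=filename_to_isodate)
```

Here `by_name` starts with 01-07-2020 and ends with 30-05-2020, while `by_date` runs from 30-05-2020 through 01-07-2020, oldest to newest.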
Edit: following input from @Daniel F, the use of strptime from the datetime module was replaced by sorting on the date in ISO string format, for speed. Below is the original method used in this answer.
The built-in datetime module can be used to parse the date according to a given format, in this case %d-%m-%Y. strptime gives a datetime object that supports comparison, meaning that it can be sorted. The arguments passed to strptime were:
os.path.basename(s).rsplit('.', 1)[0][-10:], '%d-%m-%Y'
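The full original call is not preserved above, so the following is only a sketch of what a strptime-based key could look like. The function name `filename_to_datetime` and the sample list are assumptions made for illustration.

```python
import os
from datetime import datetime


def filename_to_datetime(filename):
    """Parse the trailing dd-mm-yyyy of the basename into a datetime object."""
    stem = os.path.basename(filename).rsplit('.', 1)[0]
    return datetime.strptime(stem[-10:], '%d-%m-%Y')


# Sample names in the order reported in the question
filenames = [
    '../Desktop/diary/19-06-2020.txt',
    '../Desktop/diary/17-06-2020.txt',
    '../Desktop/diary/16-06-2020.txt',
    '../Desktop/diary/18-06-2020.txt',
]

# datetime objects compare chronologically, so they work directly as sort keys
filenames_sorted = sorted(filenames, key=filename_to_datetime)
```

One trade-off worth noting: strptime raises a ValueError for a name that does not match the format, whereas the string-slicing key silently produces a meaningless key for such a name.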