I have a file with decent structure, but the date shared by the events that follow it (one or more) is only printed once. I can't figure out how to read the file, recognize the dates, and then map each date to every game result that follows, until the next date appears.
The data looks like this:
Sa 19.11.2016
FC Tuggen
FC Basel 1893 II
1
3
SC Cham
FC Zürich II
0
1
SC Kriens
FC Köniz
3
1
Sa 26.11.2016
FC Bavois
SC Brühl
1
4
Mi 30.11.2016
FC Zürich II
FC Basel 1893 II
2
2
Each date can apply to one or more game results. I've tried reading through the file and grepping the dates:
keys = []
for line in d:
    if line[0:2] in ('Sa','So','Mo','Di','Mi','Do','Fr'):
        keys.append(line[2:-1].strip())
But then I don't know how to assign the same date to the games that follow, until the next date arrives. I've tried various combinations of enumerate(), xrange(), etc. enumerate() didn't work the way I used it, because I could only add the first game after each date.
My desired output looks as follows; a defaultdict(list) with the dates as keys and small dictionaries as list elements would also be fine:
Sa 19.11.2016,FC Tuggen,FC Basel 1893 II,1,3
Sa 19.11.2016,SC Cham,FC Zürich II,0,1
Sa 19.11.2016,SC Kriens,FC Köniz,3,1
Sa 26.11.2016,FC Bavois,SC Brühl,1,4
Mi 30.11.2016,FC Zürich II,FC Basel 1893 II,2,2
Something as simple as the following might work, assuming that the input file has a format similar to what you have shown, with an empty line after each game. Keep track of the last seen date using a variable.
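lastseendate = None
gameinfo = []
for line in f: # f is the already opened input file
    if line[0:2] in ('Sa','So','Mo','Di','Mi','Do','Fr'): # date row
        lastseendate = line.strip()
    elif len(line.strip()) == 0: # empty line
        if gameinfo: # ignore empty lines that do not follow a game
            print(lastseendate + ',' + ','.join(gameinfo)) # print out the row for game just read before
            gameinfo = [] # ready to read the next game info
    else:
        gameinfo.append(line.strip())
if gameinfo: # last game, in case the file does not end with an empty line
    print(lastseendate + ',' + ','.join(gameinfo))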
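If the file has no empty lines between games (as in the sample pasted in the question), the empty-line check above never fires. Here is a minimal sketch that instead assumes every game is exactly four lines (home team, away team, and the two scores) and that the file starts with a date row. The name rows_from_lines and the four-line assumption are mine, not something stated in the question.

```python
DAY_PREFIXES = ('Sa', 'So', 'Mo', 'Di', 'Mi', 'Do', 'Fr')
FIELDS_PER_GAME = 4  # assumption: home team, away team, home score, away score


def rows_from_lines(lines):
    """Yield one 'date,home,away,score,score' string per game.

    Assumes the first non-empty line is a date row.
    """
    lastseendate = None
    gameinfo = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate empty separator lines, if there are any
        # date row: known two-letter day prefix followed by a space
        if line[0:2] in DAY_PREFIXES and line[2:3] == ' ':
            lastseendate = line
            gameinfo = []
        else:
            gameinfo.append(line)
            if len(gameinfo) == FIELDS_PER_GAME:  # one complete game collected
                yield ','.join([lastseendate] + gameinfo)
                gameinfo = []
```

Since an open file is an iterable of lines, something like `for row in rows_from_lines(f): print(row)` should print the desired output.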
If there are too many possible two-letter prefixes before the date to hardcode, then you could use a regular expression like the one below.
import re
pat = re.compile(r"[A-Za-z]{2} \d{2}\.\d{2}\.\d{4}") # two letters, a space, then DD.MM.YYYY
Then replace the line marked # date row with
    if pat.match(line):
Note that no \n is needed in the print statement, as print already adds a newline.
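For the defaultdict(list) form mentioned in the question, the same "remember the last date" idea can be sketched as follows. This is only an illustration: the function name games_by_date, the dictionary keys, and the assumption that each game is exactly four lines (two teams, two integer scores) are mine.

```python
import re
from collections import defaultdict

# Two letters, a space, then DD.MM.YYYY, e.g. "Sa 19.11.2016"
DATE_PAT = re.compile(r"[A-Za-z]{2} \d{2}\.\d{2}\.\d{4}$")


def games_by_date(lines):
    """Group games under the most recently seen date row."""
    games = defaultdict(list)
    current_date = None
    pending = []  # lines collected for the game currently being read
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if DATE_PAT.match(line):
            current_date = line
            pending = []
        else:
            pending.append(line)
            if len(pending) == 4:  # assumption: 2 team names + 2 scores
                home, away, home_goals, away_goals = pending
                games[current_date].append({
                    'home': home,
                    'away': away,
                    'home_goals': int(home_goals),
                    'away_goals': int(away_goals),
                })
                pending = []
    return games
```

Each key is then a date string such as 'Sa 19.11.2016', and each value is the list of games played on that date, in file order.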