The prompt:
Write a program that categorizes each mail message by which day of the week the commit was done. To do this look for lines that start with "from", then look for the third word and keep a running count of each of the days of the week. At the end of the program print out the contents of your dictionary (order does not matter).
The code in Python 3:
    fname = input('enter file name:')
    fhand = None
    days = dict()
    try:
        fhand = open(fname)
    except:
        print(fname, 'is not a file thank you have a nice day and stop trying to ruin my program\n')
        exit()
    for line in fhand:
        sline = line.split()
        if line.startswith('From'):
            print(sline)
            day = sline[2]
            if day not in days:
                days[day] = 1
            else:
                days[day] += 1
    print(days)
The problem:
    ['From', 'stephen.marquard@uct.ac.za', 'Sat', 'Jan', '5', '09:14:16', '2008']
    **['From:', 'stephen.marquard@uct.ac.za']**
    Traceback (most recent call last):
      File "C:\Users\s_kestlert\Desktop\Programming\python\chap9.py", line 13, in <module>
        day = sline[2]
    IndexError: list index out of range
The file: http://www.py4inf.com/code/mbox-short.txt
Why does .split() cut the line down to only [0] and [1]? How can I circumvent this?
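Looking at the file you linked, I think you need to change line.startswith('From') to line.startswith('From ') (note the trailing space). .split() is not cutting anything off: the From: ... header lines also start with 'From', so they are being matched too, and they only contain two words, which is why sline[2] raises the IndexError. I think you only want the From ... lines, which carry the extra information (day, month, date, time, year).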
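Here is a minimal sketch of that change, assuming the lines look like the two shown in your output. The count_days function and the sample list are names I made up for illustration, and the length check is an extra safety net of mine rather than something the exercise asks for.

```python
def count_days(lines):
    """Count the weekday (third word) of each line that starts with 'From '."""
    days = {}
    for line in lines:
        # The trailing space skips 'From:' header lines, which have only two words.
        if not line.startswith('From '):
            continue
        words = line.split()
        # Defensive check (an assumption, not part of the exercise):
        # ignore a 'From ' line that is too short to contain a weekday.
        if len(words) < 3:
            continue
        day = words[2]
        days[day] = days.get(day, 0) + 1
    return days


# Hypothetical sample built from the two lines in the question's output.
sample = [
    'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n',
    'From: stephen.marquard@uct.ac.za\n',
]
print(count_days(sample))  # {'Sat': 1}
```

To run it against the real file, you could pass the open file handle instead of the list, for example count_days(open(fname)), since iterating a file handle yields one line at a time.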