I have a .csv file with only two columns in it, date and time:
04-02-15,11:15
04-03-15,09:35
04-04-15,09:10
04-05-15,18:05
04-06-15,10:30
04-07-15,09:20
I need this data to be plotted (preferably in an area graph, haven't gotten that far yet) using matplotlib. I need the y-axis to be time, and the x-axis to be date. I'm having trouble wrapping my head around some of the usage for time/date, and was hoping someone could take a look at my code and offer some guidance:
import numpy as np
from pylab import *
import matplotlib.pyplot as plt
import datetime as DT
data= np.loadtxt('daily_count.csv', delimiter=',',
dtype={'names': ('date', 'time'),'formats': ('S10', 'S10')} )
x = [DT.datetime.strptime(key,"%m-%d-%y") for (key, value) in data ]
y = [DT.datetime.strptime(key,"%h:%m") for (key, value) in data]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()
fig.autofmt_xdate()
fig.autofmt_ytime()
plt.plot(x,y)
plt.xlabel('Date')
plt.ylabel('Time')
plt.title('Peak Time')
plt.show()
Each time I try to run it, I get this error:
ValueError: time data '04-02-15' does not match format '%h:%m'
I've also got a suspicion about the ticks for the y-axis, which thus far don't seem to be established. I'm very open to suggestions for the rest of this code as well - thanks in advance, internet heroes!
So the traceback tells you the problem. It is trying to parse your date as your time, and this is a result of the way you parsed the data in these lines:
data= np.loadtxt('daily_count.csv', delimiter=',',
dtype={'names': ('date', 'time'),'formats': ('S10', 'S10')} )
x = [DT.datetime.strptime(key,"%m-%d-%y") for (key, value) in data ]
y = [DT.datetime.strptime(key,"%h:%m") for (key, value) in data]
There are multiple solutions, but the root of the 'problem' is that when you use loadtxt and define the names and formats, it gives you back a sequence of (date, time) tuples, i.e.,
[('04-02-15', '11:15') ('04-03-15', '09:35') ('04-04-15', '09:10')
('04-05-15', '18:05') ('04-06-15', '10:30') ('04-07-15', '09:20')]
So when you looped over it and used only key, you were always accessing the dates:
>>> print [key for (key, value) in data]
['04-02-15', '04-03-15', '04-04-15', '04-05-15', '04-06-15', '04-07-15']
So you were trying to turn '04-02-15' into the format '%h:%m', which of course will not work.
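One direct fix is to take the second element of each pair for the times. A minimal sketch, assuming Python 3 and a hard-coded list `rows` (an illustrative name, not from the original code) standing in for what loadtxt returns:

```python
# Stand-in for the (date, time) pairs that loadtxt gives back.
rows = [('04-02-15', '11:15'), ('04-03-15', '09:35'), ('04-04-15', '09:10')]

# First element of each pair is the date, second is the time.
date_strings = [date for (date, time) in rows]
time_strings = [time for (date, time) in rows]

print(date_strings)
print(time_strings)
```

The original loop used key in both comprehensions, so the second list was just a copy of the dates.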
To get to the point, you can split the parsed data into a list of dates and a list of times using the zip function. For example,
print map(list, zip(*data))
[['04-02-15', '04-03-15', '04-04-15', '04-05-15', '04-06-15', '04-07-15'],
 ['11:15', '09:35', '09:10', '18:05', '10:30', '09:20']]
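The same transposition can be tried without the CSV file. This is only a sketch, assuming Python 3, with `rows` as a hypothetical stand-in for the loadtxt result:

```python
# Stand-in for the (date, time) pairs that loadtxt gives back.
rows = [('04-02-15', '11:15'), ('04-03-15', '09:35'), ('04-04-15', '09:10')]

# zip(*rows) regroups the pairs column by column:
# one tuple holding every date, one tuple holding every time.
dates, times = map(list, zip(*rows))

print(dates)
print(times)
```

In Python 3 map returns a lazy iterator rather than a list, but unpacking it into two names still works because it yields exactly two items.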
Also, you need to check the formats you pass to strptime. For example, "%h:%m" won't work: %h is not the hour directive (the hour on a 24-hour clock is %H), and %m means month (minutes are %M). You can find a nice summary in the docs, or here: http://strftime.org/.
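To check the directives in isolation, here is a small sketch (assuming Python 3; the variable names are illustrative) that parses one date and one time from the file:

```python
import datetime as DT

# %m = month, %d = day, %y = two-digit year
day = DT.datetime.strptime('04-02-15', '%m-%d-%y')

# %H = hour (24-hour clock), %M = minute
clock = DT.datetime.strptime('11:15', '%H:%M')

print(day)    # 2015-04-02 00:00:00
print(clock)  # 1900-01-01 11:15:00

# The format from the question is rejected for a time string.
try:
    DT.datetime.strptime('11:15', '%h:%m')
    bad_format_rejected = False
except ValueError:
    bad_format_rejected = True
print(bad_format_rejected)
```

Note that a time parsed without a date lands on 1900-01-01, so every y value shares the same dummy date and only the time of day differs.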
Putting it all together:
import numpy as np
import matplotlib.pyplot as plt
import datetime as DT
# Read both columns as strings
data= np.loadtxt('daily_count.csv', delimiter=',',
         dtype={'names': ('date', 'time'),'formats': ('S10', 'S10')} )
# Split the (date, time) pairs into two separate lists
dates, times = map(list, zip(*data))
print dates, times
# %m-%d-%y is month-day-year; %H:%M is hour:minute on a 24-hour clock
x = [DT.datetime.strptime(date,"%m-%d-%y") for date in dates]
y = [DT.datetime.strptime(time,"%H:%M") for time in times]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid()
plt.plot(x,y)
plt.xlabel('Date')
plt.ylabel('Time')
plt.title('Peak Time')
plt.show()
which gives the following plot: