I have data in a file. It looks like this:
08:00,user1,1
08:10,user3,2
08:15,empty,0
....
How could I plot binary data with hours on the x-axis and users on the y-axis? Users will be denoted with different markers according to user. For example, user1 is to be denoted as * and user3 as o. The y-axis value is 1 for a user and 0 for empty.
The number after the username in the text file is meant to be used in a conditional statement to decide which marker is drawn.
Here is a pic of what I want to do.
You can load the file with np.recfromcsv. To convert the time column into datetime objects, we first define a convtime function and then pass it as a converter when reading your CSV file.
import datetime
import numpy as np
import matplotlib.pyplot as plt
# Parse an "HH:MM" string into a datetime object
convtime = lambda x: datetime.datetime.strptime(x, "%H:%M")
# converters={0: convtime} says: parse column 0 using the convtime function
all_records = np.recfromcsv("myfilename.csv", names=["time", "user", "val"], converters={0: convtime})
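If np.recfromcsv is not available in your NumPy version (as far as I know it was dropped from the main namespace in NumPy 2.0, with np.genfromtxt suggested instead), here is a minimal fallback sketch that uses only the standard library. The helper name load_records is made up for illustration, and it assumes every data row has exactly three comma-separated fields with an integer in the last one.

```python
import csv
import datetime


def load_records(path):
    """Read time,user,val rows into a list of dicts (hypothetical helper)."""
    records = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) != 3:
                continue  # skip blank or malformed lines
            time_text, user, val = (field.strip() for field in row)
            records.append({
                # "%H:%M" matches times such as 08:10
                "time": datetime.datetime.strptime(time_text, "%H:%M"),
                "user": user,
                "val": int(val),
            })
    return records
```

Each record can be indexed as rec['time'], rec['user'] and rec['val'], so a plotting loop written for the record array should work on this list too.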
Note that since we have given just the time part to datetime, it will assume the date as 1 January 1900. You can add a relevant date to it if you care.
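To illustrate the date default and one way to attach a real date, here is a small sketch. The date 2015-03-01 is an arbitrary placeholder, not something taken from your file.

```python
import datetime

# Parsing only "%H:%M" leaves the date at strptime's default of 1900-01-01.
parsed = datetime.datetime.strptime("08:10", "%H:%M")

# Placeholder recording date, chosen only for illustration.
recording_day = datetime.date(2015, 3, 1)

# Keep the parsed time of day, but attach the real date.
stamped = datetime.datetime.combine(recording_day, parsed.time())
```

The same combine call could go inside convtime if every row belongs to the same day.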
Now, to plot the data. This brings us to a curious problem: a single plt.scatter call can use only one marker symbol for all the points it plots. Unfortunately, this means we have to use a for loop. First, let us define dicts for the symbol and colour of each user:
symbols = {'user1':'*', 'user3':'o', 'empty':'x'}
colours = {'user1':'blue', 'user3':'red', 'empty':'orange'}
for rec in all_records:
    plt.scatter(rec['time'], rec['val'], c=colours[rec['user']], marker=symbols[rec['user']])
That almost does it. We are still missing the legend. A drawback of this for loop is that, if we labelled each scatter call, every row in your file would make one entry in the legend. We beat this by creating a custom legend.
import matplotlib.lines as mlines
legend_list = []
for user in symbols.keys():
    legend_list.append(mlines.Line2D([], [], color=colours[user], marker=symbols[user], ls='none', label=user))
plt.legend(loc='upper right', handles=legend_list)
plt.show()
That does it! If your plot appears squished, then use plt.xlim() to adjust limits to your taste.
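As a possible alternative to the per-row loop and the hand-built legend, you could group the rows by user and call scatter once per user with a label, so that the legend gets one entry per user on its own. This is only a sketch: group_by_user and plot_by_user are made-up helper names, and it assumes each record can be indexed by 'time', 'user' and 'val'.

```python
from collections import defaultdict


def group_by_user(records):
    """Collect times and values per user, so each user needs one scatter call."""
    grouped = defaultdict(lambda: {"time": [], "val": []})
    for rec in records:
        grouped[rec["user"]]["time"].append(rec["time"])
        grouped[rec["user"]]["val"].append(rec["val"])
    return dict(grouped)


def plot_by_user(records, symbols, colours):
    """Draw one scatter series per user and let matplotlib build the legend."""
    # Imported here so group_by_user can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for user, cols in group_by_user(records).items():
        ax.scatter(cols["time"], cols["val"], c=colours[user],
                   marker=symbols[user], label=user)
    ax.legend(loc="upper right")
    return fig, ax
```

Calling plot_by_user(all_records, symbols, colours) followed by plt.show() should give a similar picture. One difference is that the legend only lists users that actually occur in the data, whereas the custom legend lists every key in symbols.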