
Using a condition from file data to plot with matplotlib python


I have data in a file. It looks like this:

08:00,user1,1   
08:10,user3,2   
08:15,empty,0  
....

How can I plot this as binary data, with hours on the x-axis and users on the y-axis? Each user should get its own marker: for example, user1 drawn as * and user3 as o. The y-value is 1 for a user and 0 for empty. The number after the username in the text file is meant to decide, in a conditional statement, which marker is used.

Here is a picture of what I want to do.

[image: sketch of the desired plot]


Solution

  • You can load the file with np.recfromcsv. To turn the time column into datetime objects, we first define a convtime function and then pass it as a converter when reading your CSV file.

    import datetime
    import numpy as np
    import matplotlib.pyplot as plt
    convtime = lambda x: datetime.datetime.strptime(x, "%H:%M")
    # converters={0: convtime} says: parse column 0 using the convtime function
    all_records = np.recfromcsv("myfilename.csv", names=["time", "user", "val"], converters={0: convtime})
    

    Note that since we gave datetime only the time part, it assumes the date 1 January 1900. You can attach a real date if that matters to you.
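
If np.recfromcsv is not available in your NumPy (as far as I know, NumPy 2.0 dropped it from the main namespace in favour of np.genfromtxt), the loading step can be sketched with the standard library alone. This is only a sketch under assumptions: the helper name load_records, its day parameter and the date 2024-01-15 are made up for illustration, and rows that do not have exactly three fields are simply skipped.

```python
import csv
import datetime
import io


def load_records(lines, day=None):
    """Parse 'HH:MM,user,val' rows into (datetime, user, int) tuples.

    `day` is a hypothetical parameter: when given, it replaces the
    default 1900-01-01 date that strptime fills in.
    """
    records = []
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        time_text, user, val = (field.strip() for field in row)
        stamp = datetime.datetime.strptime(time_text, "%H:%M")
        if day is not None:
            # keep the parsed time of day, swap in the real date
            stamp = datetime.datetime.combine(day, stamp.time())
        records.append((stamp, user, int(val)))
    return records


# In-memory stand-in for the file from the question (assumed layout).
sample = io.StringIO("08:00,user1,1   \n08:10,user3,2   \n08:15,empty,0  \n")
records = load_records(sample, day=datetime.date(2024, 1, 15))
```

With a real file you would pass an open file object instead of the StringIO, for example open("myfilename.csv", newline="").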

    Now, to plot the data. This brings us to a curious limitation: a single plt.scatter call draws all of its points with one marker, so each marker needs its own call. The simplest way to do that is a for loop over the records. First, let us define dicts for the symbol and colour of each user:

    symbols = {'user1':'*', 'user3':'o', 'empty':'x'}
    colours = {'user1':'blue', 'user3':'red', 'empty':'orange'}
    for rec in all_records:
        plt.scatter(rec['time'], rec['val'], c=colours[rec['user']], marker=symbols[rec['user']])
    

    That almost does it. We are still missing the legend. A drawback of this for loop is that labelling each scatter call would create one legend entry per row in your file. We get around this by building a custom legend with one handle per user.

    import matplotlib.lines as mlines
    legend_list = []
    for user in symbols.keys():
        legend_list.append(mlines.Line2D([], [], color=colours[user], marker=symbols[user], ls='none', label=user))
    plt.legend(loc='upper right', handles=legend_list)
    plt.show()
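
As an alternative to the per-row loop plus custom legend (a different technique, not the one used above), you could group the rows by user first and call scatter once per user with a label. The legend then gets one entry per user without any proxy handles. This is a rough sketch: group_by_user and plot_grouped are hypothetical helper names, and records is assumed to be an iterable of (time, user, val) tuples.

```python
from collections import defaultdict
import datetime

# Same marker and colour choices as in the answer.
symbols = {'user1': '*', 'user3': 'o', 'empty': 'x'}
colours = {'user1': 'blue', 'user3': 'red', 'empty': 'orange'}


def group_by_user(records):
    """Collect (time, user, val) rows into {user: ([times], [vals])}."""
    grouped = defaultdict(lambda: ([], []))
    for time, user, val in records:
        times, vals = grouped[user]
        times.append(time)
        vals.append(val)
    return dict(grouped)


def plot_grouped(records):
    """Draw one scatter series per user so each user gets one legend entry."""
    # Imported here so group_by_user stays usable without matplotlib.
    import matplotlib.pyplot as plt

    for user, (times, vals) in group_by_user(records).items():
        plt.scatter(times, vals, color=colours[user],
                    marker=symbols[user], label=user)
    plt.legend(loc='upper right')
    plt.show()
```

Calling plot_grouped(records) replaces both the plotting loop and the legend code. Users that never occur in the data do not show up in the legend, which may or may not be what you want.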
    

    That does it! If your plot appears squished, call plt.xlim() before plt.show() to adjust the x-axis limits to your taste.
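
For the plt.xlim() adjustment, one possible sketch is to pad the first and last timestamps by a few minutes and, while you are at it, show the ticks as HH:MM. The helper names padded_limits and tidy_time_axis and the 10-minute padding are assumptions for illustration, not part of the original answer.

```python
import datetime


def padded_limits(times, pad=datetime.timedelta(minutes=10)):
    """Return (earliest - pad, latest + pad) for a sequence of datetimes."""
    return min(times) - pad, max(times) + pad


def tidy_time_axis(times):
    """Apply padded x-limits and HH:MM tick labels to the current axes."""
    # Imported here so padded_limits stays usable without matplotlib.
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    plt.xlim(*padded_limits(times))
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
```

Call tidy_time_axis with the list of plotted datetimes after the scatter calls and before plt.show().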