python, matplotlib, pandas, timeline

Matplotlib timelines


I'm looking to take a pandas DataFrame with a bunch of timelines in it and plot them in a single figure. The DataFrame index consists of Timestamps, and there's a specific column, which we'll call "sequence", that contains strings like "A" and "B". So the DataFrame looks something like this:

+--------------------------+---+
| 2014-07-01 00:01:00.0000 | A |
+--------------------------+---+
| 2014-07-01 00:02:00.0000 | B |
+--------------------------+---+
| 2014-07-01 00:04:00.0000 | A |
+--------------------------+---+
| 2014-07-01 00:08:00.0000 | A |
+--------------------------+---+
| 2014-07-01 00:08:00.0000 | B |
+--------------------------+---+
| 2014-07-01 00:10:00.0000 | B |
+--------------------------+---+
| 2014-07-01 00:11:00.0000 | B |
+--------------------------+---+

I'm looking for a plot something like this:

B |  *     * **
A | *  *   *
  +------------
    Timestamp

Solution

  • I would just map each category to a y-value using a dictionary.

    import random
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas
    
    categories = list('ABCD')
    
    # map categories to y-values
    cat_dict = dict(zip(categories, range(1, len(categories)+1)))
    
    # map y-values to categories
    val_dict = dict(zip(range(1, len(categories)+1), categories))
    
    # setup the dataframe
    dates = pandas.date_range(start='2012-05-05 13:00', end='2012-05-05 18:59', freq='20min')
    values = [random.choice(categories) for _ in range(len(dates))]
    df = pandas.DataFrame(data=values, index=dates, columns=['category'])
    
    # determine the y-values from the categories
    df['plotval'] = df['category'].apply(cat_dict.get)
    
    # make the plot
    fig, ax = plt.subplots()
    df['plotval'].plot(ax=ax, style='ks')
    ax.margins(0.2)
    
    # put one y-tick on each category's y-value and label it with the category name
    ax.set_yticks(list(val_dict))
    ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, pos: val_dict.get(x, '')))
    

    This gives a plot with one row of black square markers per category.

    Note that since you probably already have a dataframe, you can build cat_dict and val_dict like this:

    # unique categories actually present in the dataframe
    unique_cats = pandas.unique(df['category'])
    
    # map categories to y-values
    cat_dict = dict(zip(unique_cats, range(1, len(unique_cats)+1)))
    
    # map y-values to categories
    val_dict = dict(zip(range(1, len(unique_cats)+1), unique_cats))
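
Applied to the sample data from the question, the same mapping idea might look like the sketch below. It assumes the column is literally named "sequence", sorts the labels so that A is drawn below B, and sets the tick labels directly with set_yticks/set_yticklabels instead of a formatter. The Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# sample data from the question; "sequence" is the column name used there
index = pd.to_datetime([
    "2014-07-01 00:01", "2014-07-01 00:02", "2014-07-01 00:04",
    "2014-07-01 00:08", "2014-07-01 00:08", "2014-07-01 00:10",
    "2014-07-01 00:11",
])
df = pd.DataFrame({"sequence": list("ABAABBB")}, index=index)

# map categories to y-values (sorted, so A gets 1 and B gets 2)
labels = sorted(df["sequence"].unique())
cat_dict = {label: y for y, label in enumerate(labels, start=1)}

# determine the y-value for every row
df["plotval"] = df["sequence"].map(cat_dict)

# one star per event, one row per category
fig, ax = plt.subplots()
ax.plot(df.index, df["plotval"], "k*")
ax.set_yticks(list(cat_dict.values()))
ax.set_yticklabels(list(cat_dict.keys()))
ax.margins(0.2)
ax.set_xlabel("Timestamp")
# call plt.show() or fig.savefig(...) here to view or save the figure
```

Setting the ticks explicitly keeps labels off the positions between categories, which a formatter alone cannot guarantee.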