Removing timestamp, day and month from Pandas dataframe for a time series plot


import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

I have two dataframes, each holding one time series, and I want to plot both. The two plots should be stacked on top of each other, which I am struggling to get working with Pandas dataframes. I am also having issues with the x-axis tick labels.

The dates and times are given as strings and are already in chronological order. Here is a sample of the data (taken from a larger dataset):

df1 = pd.DataFrame([["2004-03-01 00:00", 2.3],
              ["2004-03-05 00:00", 2.4],
              ["2004-03-25 00:00", 2.25],
              ["2004-07-01 00:00", 2.7],
              ["2005-01-01 00:00", 2.9],
              ["2005-02-17 00:00", 3.1],
              ["2005-12-01 00:00", 3.5],
              ["2006-02-01 00:00", 3.3],
              ["2006-04-05 00:00", 3.08],
              ["2006-08-22 00:00", 2.4],
              ["2007-07-01 00:00", 2.1]], columns = ['Date and Time', 'Values 1'])


df2 = pd.DataFrame([["2004-03-01 00:00", 12.3],
              ["2004-03-05 00:00", 14.5],
              ["2004-03-25 00:00", 12.1],
              ["2004-07-01 00:00", 10.0],
              ["2005-01-01 00:00", 12.1],
              ["2005-02-17 00:00", 9.3],
              ["2005-12-01 00:00", 8.1],
              ["2006-02-01 00:00", 6.5],
              ["2006-04-05 00:00", 7.5],
              ["2006-08-22 00:00", 6.4],
              ["2007-07-01 00:00", 4.1]], columns = ['Date and Time', 'Values 2'])

First of all, if I try to plot df1,

df1.plot(x='Date and Time', y='Values 1',  legend=False)
plt.xlabel('Year')
plt.ylabel('Values 1')
plt.show()

the output is the graph I want, but the x-axis tick labels are formatted as year-month-day-time. In this example, I want only the years 2004, 2005, 2006 and 2007 to appear as tick labels and, more importantly, I want them positioned correctly (so the 2005 tick sits at the "2005-01-01" data point). Is this possible?

Furthermore, I want to plot these graphs stacked on top of each other. I tried the code below, to no avail:

plt.figure(1)

plt.subplot(211)
df1.plot(x='Date and Time', y='Values 1', legend=False)
plt.xlabel('Year')
plt.ylabel('Values 1')

plt.subplot(212)
df2.plot(x='Date and Time', y='Values 2', legend=False)
plt.xlabel('Year')
plt.ylabel('Values 2')

plt.show()

Solution

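  • Convert the date strings with pd.to_datetime, plot onto axes created with plt.subplots, and set a year locator and formatter on each x-axis with set_major_locator and set_major_formatter: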

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    import matplotlib.dates as mdates
    
    df1 = pd.DataFrame([["2004-03-01 00:00", 2.3],
                  ["2004-03-05 00:00", 2.4],
                  ["2004-03-25 00:00", 2.25],
                  ["2004-07-01 00:00", 2.7],
                  ["2005-01-01 00:00", 2.9],
                  ["2005-02-17 00:00", 3.1],
                  ["2005-12-01 00:00", 3.5],
                  ["2006-02-01 00:00", 3.3],
                  ["2006-04-05 00:00", 3.08],
                  ["2006-08-22 00:00", 2.4],
                  ["2007-07-01 00:00", 2.1]], columns = ['Date and Time', 'Values 1'])
    
    
    df2 = pd.DataFrame([["2004-03-01 00:00", 12.3],
                  ["2004-03-05 00:00", 14.5],
                  ["2004-03-25 00:00", 12.1],
                  ["2004-07-01 00:00", 10.0],
                  ["2005-01-01 00:00", 12.1],
                  ["2005-02-17 00:00", 9.3],
                  ["2005-12-01 00:00", 8.1],
                  ["2006-02-01 00:00", 6.5],
                  ["2006-04-05 00:00", 7.5],
                  ["2006-08-22 00:00", 6.4],
                  ["2007-07-01 00:00", 4.1]], columns = ['Date and Time', 'Values 2'])
    
    # Parse the date strings so the x-axis is scaled by real time
    df1['Date and Time'] = pd.to_datetime(df1['Date and Time'])
    df2['Date and Time'] = pd.to_datetime(df2['Date and Time'])

    # Two rows, one column: ax[0] is the top plot, ax[1] the bottom one
    fig, ax = plt.subplots(2, 1, figsize=(10, 10))

    df1.plot(x='Date and Time', y='Values 1', legend=False, ax=ax[0])
    ax[0].set_xlabel('Year')
    ax[0].set_ylabel('Values 1')

    df2.plot(x='Date and Time', y='Values 2', legend=False, ax=ax[1])
    ax[1].set_xlabel('Year')
    ax[1].set_ylabel('Values 2')

    # One major tick per year, labelled with the four-digit year only.
    # Each axis gets its own locator and formatter instance, since
    # matplotlib ticker objects should not be shared between axes.
    for a in ax:
        a.xaxis.set_major_locator(mdates.YearLocator())
        a.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

    plt.tight_layout()
    plt.show()
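
The pd.to_datetime conversion is the step that lets the x-axis be scaled by elapsed time: while the column holds plain strings, the plot has no notion of how far apart two rows are. Here is a minimal check of that step, assuming a hypothetical three-row sample in the same string format as the question's data. The explicit format argument is also an assumption about the input and can be dropped if the strings vary.

```python
import pandas as pd

# Hypothetical three-row sample in the same string format as the question's data.
df = pd.DataFrame(
    [["2004-03-01 00:00", 2.3],
     ["2004-03-05 00:00", 2.4],
     ["2005-01-01 00:00", 2.9]],
    columns=["Date and Time", "Values 1"],
)

# Before conversion the column holds plain strings, not timestamps.
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df["Date and Time"])

# Parse the strings into real timestamps (format stated explicitly as an assumption).
df["Date and Time"] = pd.to_datetime(df["Date and Time"], format="%Y-%m-%d %H:%M")
is_datetime_after = pd.api.types.is_datetime64_any_dtype(df["Date and Time"])

# Gaps between consecutive rows are now real durations, measured in days.
gaps_in_days = df["Date and Time"].diff().dropna().dt.days.tolist()

print(is_datetime_before, is_datetime_after)  # False True
print(gaps_in_days)                           # [4, 302]
```

The uneven gaps (4 days, then 302 days) are what a date-aware axis reproduces as uneven horizontal spacing between the points.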
    

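
Since both series cover the same dates, a variant with a shared x-axis may also be worth considering. The sketch below is not part of the answer above: plot_stacked is a hypothetical helper name, the shortened df1 and df2 are illustrative samples taken from the question's rows, and it uses matplotlib's ax.plot directly instead of DataFrame.plot.

```python
import pandas as pd
import matplotlib.dates as mdates
import matplotlib.pyplot as plt


def plot_stacked(frames, date_col="Date and Time"):
    """Plot each (DataFrame, value column) pair in its own row with a shared x-axis.

    Hypothetical helper for illustration; returns the figure and its axes.
    """
    fig, axes = plt.subplots(len(frames), 1, sharex=True,
                             figsize=(10, 10), squeeze=False)
    for ax, (df, value_col) in zip(axes[:, 0], frames):
        # Convert on the fly so the caller's DataFrame is left untouched.
        ax.plot(pd.to_datetime(df[date_col]), df[value_col])
        ax.set_ylabel(value_col)
        # One major tick per year, labelled with the four-digit year only.
        ax.xaxis.set_major_locator(mdates.YearLocator())
        ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
    # With a shared x-axis, only the bottom plot needs the axis label.
    axes[-1, 0].set_xlabel("Year")
    fig.tight_layout()
    return fig, axes[:, 0]


# Shortened, illustrative versions of df1 and df2 from the question.
df1 = pd.DataFrame([["2004-03-01 00:00", 2.3],
                    ["2005-01-01 00:00", 2.9],
                    ["2006-02-01 00:00", 3.3],
                    ["2007-07-01 00:00", 2.1]],
                   columns=["Date and Time", "Values 1"])
df2 = pd.DataFrame([["2004-03-01 00:00", 12.3],
                    ["2005-01-01 00:00", 12.1],
                    ["2006-02-01 00:00", 6.5],
                    ["2007-07-01 00:00", 4.1]],
                   columns=["Date and Time", "Values 2"])

fig, axes = plot_stacked([(df1, "Values 1"), (df2, "Values 2")])
```

Call plt.show() to display the figure, or fig.savefig with a file name of your choice to write it to disk. With sharex=True the two plots always cover the same date range, so the year ticks line up vertically.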