python pandas matplotlib datetime time-series

How do I plot a dataframe where the x-axis is hh:mm:ss?


I am trying to plot the elements of df.

import matplotlib.pyplot as plt
import pandas as pd


original_indicies = ['00:00:00', '00:15:00', '00:30:00', '00:45:00', '01:00:00', '01:15:00',
       '01:30:00', '01:45:00', '02:00:00', '02:15:00', '02:30:00', '02:45:00',
       '03:00:00', '03:15:00', '03:30:00', '03:45:00', '04:00:00', '04:15:00',
       '04:30:00', '04:45:00', '05:00:00', '05:15:00', '05:30:00', '05:45:00',
       '06:00:00', '06:15:00', '06:30:00', '06:45:00', '07:00:00', '07:15:00',
       '07:30:00', '07:45:00', '08:00:00', '08:15:00', '08:30:00', '08:45:00',
       '09:00:00', '09:15:00', '09:30:00', '09:45:00', '10:00:00', '10:15:00',
       '10:30:00', '10:45:00', '11:00:00', '11:15:00', '11:30:00', '11:45:00',
       '12:00:00', '12:15:00', '12:30:00', '12:45:00', '13:00:00', '13:15:00',
       '13:30:00', '13:45:00', '14:00:00', '14:15:00', '14:30:00', '14:45:00',
       '15:00:00', '15:15:00', '15:30:00', '15:45:00', '16:00:00', '16:15:00',
       '16:30:00', '16:45:00', '17:00:00', '17:15:00', '17:30:00', '17:45:00',
       '18:00:00', '18:15:00', '18:30:00', '18:45:00', '19:00:00', '19:15:00',
       '19:30:00', '19:45:00', '20:00:00', '20:15:00', '20:30:00', '20:45:00',
       '21:00:00', '21:15:00', '21:30:00', '21:45:00', '22:00:00', '22:15:00',
       '22:30:00', '22:45:00', '23:00:00', '23:15:00', '23:30:00', '23:45:00']
var_1 = list(range(len(original_indicies)))
var_2 = list(range(len(original_indicies)))
var_3 = list(range(len(original_indicies)))

df = pd.DataFrame(data={'var_1':var_1,'var_2':var_2,'var_3':var_3},index=original_indicies)

However, the index is of type object. I have tried converting it to a time datatype, but I can't plot without including the date information. I want the plot to show only the hours and minutes.

Please note that the plot function cannot take a dt.time object type, and I cannot call a .time() method on the column to produce times for plotting. I have exhausted the methods in the pandas documentation.


Solution

    • There are two easy options to deal with the issue:
      1. Leave the index in its current form, as strings, and let the plotting API deal with the placement of the xticks.
        • This works because the times are evenly spaced at 15-minute intervals, so evenly spaced xticks match the data.
        • This would not work if the times were at irregular intervals, because the ticks would still be evenly spaced.
      2. Convert the values to datetime.time objects and then plot.
        • This will work regardless of the time interval.
        • Use df.time = pd.to_datetime(df.time, format='%H:%M:%S').dt.time or df.time = df.time.apply(pd.Timestamp).dt.time, where 'time' is the column of time strings.
        • df.index = pd.to_datetime(df.index, format='%H:%M:%S').time also works if the time strings are in the index.
    • Tested in python 3.11.2, pandas 2.0.1, matplotlib 3.7.1
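The conversion in option 2 can be sketched in isolation. This is a minimal example, assuming a small hypothetical sample of the time strings from the question; `df_indexed` is a name introduced here only for illustration.

```python
import datetime

import pandas as pd

# Hypothetical sample: three of the time strings from the question.
sample_times = ["00:00:00", "00:15:00", "23:45:00"]

# Case 1: the strings are in a column named 'time'.
df = pd.DataFrame({"time": sample_times, "var_1": [0, 1, 2]})
df["time"] = pd.to_datetime(df["time"], format="%H:%M:%S").dt.time

# Case 2: the strings are in the index.
df_indexed = pd.DataFrame({"var_1": [0, 1, 2]}, index=sample_times)
df_indexed.index = pd.to_datetime(df_indexed.index, format="%H:%M:%S").time

# Both now hold datetime.time objects, with no date component.
print(type(df["time"].iloc[0]))
print(type(df_indexed.index[0]))
```

In both cases the date that `pd.to_datetime` attaches during parsing is discarded, leaving only the time of day.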

    1.

    • This option creates one tick position for each time string in the dataframe.
    df = pd.DataFrame(data={'var_1': var_1, 'var_2': var_2, 'var_3': var_3}, index=original_indicies)
    
    # plot with a tick at every 10th row
    ax = df.plot(xticks=range(0, len(df), 10), figsize=(10, 6))
    
    print(ax.get_xticklabels())
    
    [Text(0, 0, '00:00:00'), Text(10, 0, '02:30:00'), Text(20, 0, '05:00:00'), Text(30, 0, '07:30:00'), Text(40, 0, '10:00:00'), Text(50, 0, '12:30:00'), Text(60, 0, '15:00:00'), Text(70, 0, '17:30:00'), Text(80, 0, '20:00:00'), Text(90, 0, '22:30:00')]
    
    • There are 120 tick index positions, from -20 to 100, but only 96 time values in the dataframe.
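To illustrate why option 1 depends on evenly spaced times, here is a small sketch with a hypothetical, unevenly spaced string index (`uneven` and `x_positions` are names introduced for this example). It inspects the x-values of the plotted line, which are row positions and not times.

```python
import pandas as pd

# Hypothetical data: the gaps are 15 minutes, then 11 hours 45 minutes.
uneven = pd.DataFrame({"var_1": [0, 1, 2]}, index=["00:00:00", "00:15:00", "12:00:00"])

ax = uneven.plot(figsize=(10, 6))

# With a string index, each row is drawn at its integer row position,
# so the large gap before '12:00:00' is not visible on the x-axis.
x_positions = [int(x) for x in ax.lines[0].get_xdata()]
print(x_positions)
```

Because the points sit one unit apart regardless of the real time gaps, this approach is only faithful when the intervals are uniform, as in the question's data.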


    • Using only ax = df.plot(figsize=(10, 6)) leaves the tick placement to the plotting API.


    2.

    • This option places each time on the x-axis by its number of seconds since midnight, so the axis covers the seconds in a 24-hour period.
    df = pd.DataFrame(data={'var_1': var_1, 'var_2': var_2, 'var_3': var_3}, index=original_indicies)
    
    # move the time values out of the index into a column named 'time'
    df = df.reset_index().rename({'index': 'time'}, axis=1)
    
    # convert the time strings to datetime.time objects
    df.time = df.time.apply(pd.Timestamp).dt.time
    
    # plot
    ax = df.plot(x='time', figsize=(10, 6))
    
    print(ax.get_xticklabels())
    
    [Text(-20000.0, 0, '18:26:40'), Text(0.0, 0, '00:00'), Text(20000.0, 0, '05:33:20'), Text(40000.0, 0, '11:06:40'), Text(60000.0, 0, '16:40'), Text(80000.0, 0, '22:13:20'), Text(100000.0, 0, '03:46:40')]
    
    • There are 120000 tick indices, -20000 to 100000, and there are 86400 seconds in a 24-hour period.
    • The plot API manages the number of ticks on the x-axis.
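If the default tick labels such as '05:33:20' are not wanted, one possible alternative is to place the ticks explicitly. The sketch below is not the method used above: it swaps in plain matplotlib, converting the times to seconds since midnight by hand and labelling the ticks with a custom formatter. The helper `seconds_to_hhmm`, the sample data, and the 3-hour tick spacing are all assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter, MultipleLocator


def seconds_to_hhmm(x, pos=None):
    """Format seconds since midnight as HH:MM, wrapping around 24 hours."""
    total = int(x) % 86400
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}"


# Hypothetical sample: a few time strings at irregular intervals.
times = ["00:00:00", "00:15:00", "06:30:00", "23:45:00"]
values = [0, 1, 2, 3]

# Parse 'HH:MM:SS' as durations and convert them to seconds since midnight.
seconds = pd.to_timedelta(times).total_seconds().to_numpy()

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(seconds, values)

# One tick every 3 hours, labelled with hours and minutes only.
ax.xaxis.set_major_locator(MultipleLocator(3 * 3600))
ax.xaxis.set_major_formatter(FuncFormatter(seconds_to_hhmm))
ax.set_xlim(0, 86400)
```

Because the x-values are real seconds, irregular intervals are spaced correctly, and the tick positions fall on whole hours, so the labels carry no seconds.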
