Tags: python, pandas, matplotlib, datetime, x-axis

How to plot time on the x-axis of a scatter plot for every minute


Setting minute minor ticks for 1-second sampled data raises: OverflowError: int too big to convert

Consider this dataframe with a sample interval of 1 second that spans about 30 minutes:

import matplotlib.pyplot as plt
from matplotlib.dates import MinuteLocator
import numpy as np  # needed for np.random.randint below
import pandas as pd

ndex = pd.date_range('2021-08-01 07:07:07', '2021-08-01 07:41:12', freq='1S', name='Time') 
df = pd.DataFrame(data=np.random.randint(1, 100, len(ndex)), index=ndex, columns=['A'])

And now we plot it:

fig, ax = plt.subplots()
df.plot(color='red', marker='x', lw=0, ms=0.2, ax=ax)

This creates a plot without any complaints (image: time vs randint).

Now I'd like to have minor ticks at every minute.

I've tried this:

ax.xaxis.set_minor_locator(MinuteLocator())

But that fails with OverflowError: int too big to convert


Solution

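    • pandas.DataFrame.plot uses matplotlib as the default plotting backend, but for this line plot it encodes the x-axis positions as unix timestamps (seconds since 1970-01-01), while matplotlib.dates locators such as MinuteLocator expect days since the epoch. Setting the locator on that axis results in OverflowError: int too big to convert.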
      • The default here is kind='line', but marker='x', lw=0, ms=0.2 are used in the OP to make a hacky scatter plot.
    • pandas.DataFrame.plot.scatter will work correctly.
    • Using matplotlib.pyplot.scatter will work as expected.
      • matplotlib: Date tick labels
        • Matplotlib date plotting is done by converting date instances into days since an epoch (by default 1970-01-01T00:00:00)
    • seaborn.scatterplot will also work:
      • sns.scatterplot(x=df.index, y=df.A, color='red', marker='x', ax=ax)
    • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2
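The points above can be sketched for the original request, minor ticks at every minute, roughly as follows. This is a minimal sketch rather than the answer's own code: the names idx and rng, the seeded random data, the Agg backend, and the choice of a major tick every 5 minutes are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild data shaped like the question's: one sample per second, about 34 minutes.
# A Timedelta is used for freq to avoid depending on a particular alias spelling.
idx = pd.date_range('2021-08-01 07:07:07', '2021-08-01 07:41:12',
                    freq=pd.Timedelta(seconds=1), name='Time')
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
df = pd.DataFrame({'A': rng.integers(1, 100, len(idx))}, index=idx)

fig, ax = plt.subplots(figsize=(25, 6))
# ax.scatter keeps the x-axis in matplotlib date units, so date locators apply
ax.scatter(x=df.index, y=df['A'], color='red', marker='x')

# Hypothetical tick layout: labelled major tick every 5 minutes, minor tick every minute
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=range(0, 60, 5)))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_minor_locator(mdates.MinuteLocator())

# Minor tick positions, in days since the matplotlib epoch
minor_ticks = ax.xaxis.get_minorticklocs()
print(len(minor_ticks))
plt.close(fig)
```

By default matplotlib drops minor ticks that coincide with major ticks, so the minor positions returned here skip the 5-minute marks.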

    matplotlib.pyplot.scatter

    • The extra formatting is optional: it removes the day of the month ('01') that would otherwise precede the time in the tick labels (e.g. '%d %H:%M').
    import matplotlib.dates as mdates
    import matplotlib.pyplot as plt
    
    fig, ax = plt.subplots(figsize=(25, 6))
    ax.scatter(x=df.index, y=df.A, color='red', marker='x')
    
    majorFmt = mdates.DateFormatter('%H:%M')  # optional: show only hour and minute in the tick labels
    
    ax.xaxis.set_major_locator(mdates.MinuteLocator())
    
    ax.xaxis.set_major_formatter(majorFmt)  # adds some extra formatting, but not required
    _ = plt.xticks(rotation=90)
    


    pandas.DataFrame.plot.scatter

    • Also pandas.DataFrame.plot with kind='scatter'
      • ax = df.reset_index().plot(kind='scatter', x='Time', y='A', color='red', marker='x', figsize=(25, 6), rot=90)
    # reset the index so Time will be a column to assign to x
    ax = df.reset_index().plot.scatter(x='Time', y='A', color='red', marker='x', figsize=(25, 6), rot=90)
    ax.xaxis.set_major_locator(mdates.MinuteLocator())
    



    • Note the difference in the xticks produced by the two methods

    pandas.DataFrame.plot xticks

    ax = df.plot(color='red', marker='x', lw=0, ms=0.2, figsize=(25, 6))
    
    # extract the xticks to see the format
    ticks = ax.get_xticks()
    print(ticks)
    [out]:
    array([1627801627, 1627803672], dtype=int64)
    
    # convert the index to unix seconds to compare with the xticks above
    (df.index - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
    
    [out]:
    Int64Index([1627801627, 1627801628, 1627801629, 1627801630, 1627801631,
                1627801632, 1627801633, 1627801634, 1627801635, 1627801636,
                ...
                1627803663, 1627803664, 1627803665, 1627803666, 1627803667,
                1627803668, 1627803669, 1627803670, 1627803671, 1627803672],
               dtype='int64', name='Time', length=2046)
    

    matplotlib.pyplot.scatter xticks

    fig, ax = plt.subplots(figsize=(25, 6))
    ax.scatter(x=df.index, y=df.A, color='red', marker='x')
    
    ticks2 = ax.get_xticks()
    print(ticks2)
    
    [out]:
    array([18840.29861111, 18840.30208333, 18840.30555556, 18840.30902778,
           18840.3125    , 18840.31597222, 18840.31944444])
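The two tick arrays differ only in units, which is what trips up the date locator. A small sketch of the comparison for the first timestamp in the data, assuming matplotlib's default epoch of 1970-01-01 (matplotlib 3.3 and later); the variable names are illustrative and not from the original answer:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.dates as mdates
import pandas as pd

ts = pd.Timestamp('2021-08-01 07:07:07')  # first sample in the question's index

# Unit seen on the axis after df.plot(...): whole seconds since 1970-01-01
unix_seconds = (ts - pd.Timestamp('1970-01-01')) // pd.Timedelta('1s')

# Unit seen on the axis after ax.scatter(...): float days since the matplotlib epoch
mpl_days = float(mdates.date2num(ts.to_pydatetime()))

print(unix_seconds)        # 1627801627
print(round(mpl_days, 4))  # 18840.2966 with the default epoch

# A date locator reads axis values as days. Read that way, the unix value
# lands millions of years after 1970, far beyond datetime.MAXYEAR (9999).
years_if_misread = unix_seconds / 365.25
print(years_if_misread > datetime.MAXYEAR)  # True
```

Dividing the unix value by 86400 (seconds per day) gives the matplotlib value, which is why plotting through ax.scatter or DataFrame.plot.scatter leaves the axis in a form MinuteLocator can use.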