python datetime matplotlib location xticks

matplotlib xticks outputs wrong array


I am trying to plot a time series that looks like this:

ts
2020-01-01 00:00:00    1300.0
2020-01-01 01:00:00    1300.0
2020-01-01 02:00:00    1300.0
2020-01-01 03:00:00    1300.0
2020-01-01 04:00:00    1300.0
                        ...  
2020-12-31 19:00:00    1300.0
2020-12-31 20:00:00    1300.0
2020-12-31 21:00:00    1300.0
2020-12-31 22:00:00    1300.0
2020-12-31 23:00:00    1300.0
Freq: H, Name: 1, Length: 8784, dtype: float64

I plot it with: ts.plot(label=label, linestyle='--', color='k', alpha=0.75, zorder=2)

If the time series ts runs from 2020-01-01 to 2020-12-31, I get the following when I call plt.xticks()[0]:

array([438288, 439032, 439728, 440472, 441192, 441936, 442656, 443400,
       444144, 444864, 445608, 446328, 447071], dtype=int64)

That is fine, since the first element of the array is the correct position of the first xtick. However, when I extend the time series to run from 2019-01-01 to 2020-12-31 (two years) and call plt.xticks()[0] again, I get the following:

array([429528, 431688, 433872, 436080, 438288, 440472, 442656, 444864,
       447071], dtype=int64)

I don't understand why I now get fewer xtick values. For 12 months I get 13 locations, so for 24 months I expected 25, but I got only 9. How can I get all 25 locations?

This is the whole script:

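import matplotlib.pyplot as plt

# ts and label are defined earlier (ts is the hourly series shown above)
fig, ax = plt.subplots(figsize=(8,4))
ts.plot(label=label, linestyle='--', color='k', alpha=0.75, zorder=2)
locs, labels = plt.xticks()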

Solution

  • Matplotlib automatically selects an appropriate number of ticks and tick labels so that the x-axis does not become unreadable. You can override the default behavior by using tick locators and formatters from the matplotlib.dates module.

    Note, however, that you are plotting the time series with the pandas plot method, which is a wrapper around plt.plot. Pandas uses custom tick locators and formatters for time series plots to produce nicely formatted tick labels. In doing so, it uses x-axis units for dates that differ from the matplotlib date units. This is why the tick positions look like arbitrary integers, and why applying a matplotlib.dates locator such as MonthLocator directly to a pandas plot gives what looks like a random number of ticks.

    To make the pandas plot compatible with matplotlib.dates tick locators, pass x_compat=True to the plot call. This argument is only sparsely documented (the pandas visualization guide mentions it under suppressing tick resolution adjustment). Unfortunately, it also disables the pandas custom tick label formatters. Here is an example of how to use a matplotlib date tick locator with a pandas plot and get a similar tick format (minor ticks not included):

    import pandas as pd                # v 1.1.3
    import matplotlib.pyplot as plt    # v 3.3.2
    import matplotlib.dates as mdates
    
    # Create sample time series stored in a dataframe
    ts = pd.DataFrame(data=dict(constant=1),
                      index=pd.date_range('2019-01-01', '2020-12-31', freq='H'))
    
    # Create pandas plot
    ax = ts.plot(figsize=(10,4), x_compat=True)
    ax.set_xlim(min(ts.index), max(ts.index))
    
    # Select and format x ticks
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    # Matplotlib >= 3.3 counts dates in days since 1970-01-01, hence unit='d'
    ticks = pd.to_datetime(ax.get_xticks(), unit='d')  # timestamps of x ticks
    # Show the year under the first tick and under each tick that starts a new year
    labels = [timestamp.strftime('%b\n%Y')
              if idx == 0 or timestamp.year != ticks[idx-1].year
              else timestamp.strftime('%b') for idx, timestamp in enumerate(ticks)]
    plt.xticks(ticks, labels, rotation=0, ha='center')
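
If the pandas tick formatting is not needed, one alternative is to skip the pandas plot wrapper and plot with matplotlib directly, so the axis uses matplotlib date units from the start. The following is a minimal sketch rather than a definitive recipe: the sample series, the Agg backend, and the choice of ConciseDateFormatter for the labels are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Assumed sample data: a constant hourly series covering two years
index = pd.date_range("2019-01-01", "2020-12-31 23:00", freq="h")
ts = pd.Series(1300.0, index=index)

# Plot with matplotlib directly so the x-axis uses matplotlib date units
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(ts.index, ts.to_numpy(), linestyle="--", color="k", alpha=0.75)
ax.set_xlim(ts.index[0], ts.index[-1])

# One major tick per month, with compact labels derived from the same locator
locator = mdates.MonthLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

# Convert the tick positions back to datetimes to inspect them
tick_dates = mdates.num2date(ax.get_xticks())
print(len(tick_dates), tick_dates[0], tick_dates[-1])
```

With this approach, plt.xticks()[0] returns matplotlib date numbers instead of the pandas-specific integers shown in the question.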
    

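
To see what the integers returned by plt.xticks()[0] in the question mean, it can help to decode them. The values are consistent with a count of hours since 1970-01-01, which appears to be how pandas numbers the positions of an hourly series. That interpretation is an assumption checked only against the numbers in the question, so treat this as a sketch:

```python
import numpy as np
import pandas as pd

# Tick locations copied from the question (two-year plot)
locs = np.array([429528, 431688, 433872, 436080, 438288,
                 440472, 442656, 444864, 447071])

# Assumption: for an hourly series, the x units are hours since 1970-01-01
tick_times = pd.Timestamp("1970-01-01") + pd.to_timedelta(locs, unit="h")

for loc, when in zip(locs, tick_times):
    print(loc, when)
```

Decoded this way, the nine ticks fall on the first day of each quarter from 2019-01-01 to 2020-10-01, plus the last timestamp of the series. This matches the explanation that the number of ticks is reduced automatically as the plotted range grows.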