Incorrect timestamps are shown on the x-axis


I generate a bar plot with the following code:

import matplotlib as mplt
from matplotlib import dates, pyplot
from matplotlib.transforms import ScaledTranslation
import numpy as np
import pandas as pd

ts = pd.date_range('2023/01/01', '2023/01/06', freq='3H', tz='utc')
xs = np.arange(len(ts))
df = pd.DataFrame({'date':ts,'value':np.ones(shape=len(ts)), 'intensity':np.random.uniform(0, 10, len(ts))})

colors = []
for i in df.intensity:
    if 0 <= i < 6:
        colors.append('#75FF71')
    elif 6 <= i < 8:
        colors.append('#FFC53D')
    else:
        colors.append('#FF5C5C')

# pyplot.box
fig, ax = pyplot.subplots(figsize = (24,1), constrained_layout=False)
ax.yaxis.set_ticklabels(labels=[])
ax.yaxis.set_visible(False)
ax.grid(False)
ax.set_frame_on(False)

hour_locs = dates.HourLocator(byhour=[6, 12, 18])
hour_locs_fmt = dates.DateFormatter('%H:%M')
ax.xaxis.set_minor_locator(hour_locs)
ax.xaxis.set_minor_formatter(hour_locs_fmt)

day_locs = dates.DayLocator(interval=1)
day_locs_fmt = dates.DateFormatter('%B %-d')
ax.xaxis.set_major_locator(day_locs)
ax.xaxis.set_major_formatter(day_locs_fmt)
ax.xaxis.set_tick_params(which='major', pad=-10, length=40)
ax.bar(df.date, df.value, color=colors)

offset = ScaledTranslation(1.6, 0, fig.dpi_scale_trans)
for label in ax.xaxis.get_majorticklabels():
    label.set_transform(label.get_transform() + offset)

The output: [image: Output]

The timestamps start at 2023/01/01 00:00:00+00:00 (UTC); however, the plot shows the data starting at ~15:00 the day before. I assume matplotlib is ignoring the timezone in the data.

I tried specifying the timezone in the locators and the formatters, but in vain.

How do I get matplotlib to plot in UTC?


Solution

  • Because you are plotting bars, matplotlib adds a little extra space at either end of the x-axis. That padding is why the plot appears to start before your first timestamp. To fix it, add the line below (ax.set_xlim(...)) at the end, which forces the plot to start and end at the times you want.

    ax.set_xlim(df.date.min(), df.date.max())
    

    You will get the plot below, which is hopefully what you are looking for.

    [image: plot with x-limits set to the first and last timestamps]
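
To see how much padding is involved, the automatic limits can be compared with the pinned ones. This is only a minimal sketch, assuming a stripped-down version of the question's data (no colors or tick formatting); the variable names `auto_left`, `fixed_left`, and so on are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
from matplotlib import dates, pyplot
import numpy as np
import pandas as pd

# Same 3-hourly UTC range as in the question.
ts = pd.date_range("2023-01-01", "2023-01-06", freq=pd.Timedelta(hours=3), tz="utc")
df = pd.DataFrame({"date": ts, "value": np.ones(len(ts))})

fig, ax = pyplot.subplots(figsize=(24, 1))
ax.bar(df.date, df.value)

# Limits picked automatically: the extent of the bars plus the default margins.
auto_left, auto_right = ax.get_xlim()

# Limits pinned to the first and last timestamps, as in the answer.
ax.set_xlim(df.date.min(), df.date.max())
fixed_left, fixed_right = ax.get_xlim()

# get_xlim() returns floats (days); num2date turns them back into datetimes.
print("auto :", dates.num2date(auto_left), dates.num2date(auto_right))
print("fixed:", dates.num2date(fixed_left), dates.num2date(fixed_right))
```

The "auto" pair should start before midnight on Jan 1, while the "fixed" pair matches the data exactly.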

    EDIT: I realize that the last date (Jan 6th) also shows up. If you don't want that, replace df.date.max() with df.date.nlargest(2).iloc[-1], which returns the second-latest timestamp, so the last date is not shown. However, this also removes the last vertical tick line and any time label that would appear there. It is your call which one you prefer; the other plot is below.

    [image: plot with the right x-limit set to the second-latest timestamp]
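
As a quick check of what that expression returns, here is a small sketch on the same date range (the names `date_col`, `last`, and `second_last` are illustrative, not from the original code):

```python
import pandas as pd

# Same 3-hourly UTC range as in the question.
ts = pd.date_range("2023-01-01", "2023-01-06", freq=pd.Timedelta(hours=3), tz="utc")
date_col = pd.Series(ts, name="date")

last = date_col.max()                       # 2023-01-06 00:00 UTC
second_last = date_col.nlargest(2).iloc[-1]  # 2023-01-05 21:00 UTC

print(last)
print(second_last)
```

nlargest(2) returns the two latest timestamps in descending order, so .iloc[-1] picks the earlier of the two.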

    EDIT #2: To remove the last label (Jan 6) AND keep the last major tick line, use the first option, which keeps Jan 6th within the limits, and then hide the last tick label. Note that this is very specific to your case, where the last data point falls at 00:00 of a day, and it may not work if you change the data.

    ## Code to add before pyplot.show() ##
    ax.set_xlim(df.date.min(), df.date.max())
    x_ticks = ax.xaxis.get_major_ticks()  ## Get all ticks
    x_ticks[-1].label1.set_visible(False) ## Remove last tick label
    

    Updated plot: [image: plot with the last tick label hidden]
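
A related point that may explain the size of the gap: on a date axis the x values are measured in days, so the default bar width of 0.8 means 0.8 days, and bars are centered on their x value by default. The first bar would then extend about 0.4 days (roughly 9.6 hours) before the first timestamp, which is in line with the ~15:00 start described in the question. The sketch below is an alternative to set_xlim, not part of the original answer; it assumes each bar should cover its own 3-hour slot, and `first` is just an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
from matplotlib import dates, pyplot
import numpy as np
import pandas as pd

# Same 3-hourly UTC range as in the question.
ts = pd.date_range("2023-01-01", "2023-01-06", freq=pd.Timedelta(hours=3), tz="utc")
df = pd.DataFrame({"date": ts, "value": np.ones(len(ts))})

fig, ax = pyplot.subplots(figsize=(24, 1))

# width is in days on a date axis, so 3 / 24 is three hours.
# align="edge" starts each bar at its timestamp instead of centering on it.
bars = ax.bar(df.date, df.value, width=3 / 24, align="edge")
ax.margins(x=0)  # no extra padding beyond the bars

first = bars.patches[0]
print(dates.num2date(first.get_x()), first.get_width())
```

With these settings the last bar extends three hours past Jan 6 00:00, so set_xlim may still be wanted to trim the right-hand side.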