Why is my Altair plot not showing the latest available date in my DataFrame? (An image of my df is below for context.)
I've tried several plots, but the tooltip reveals that the chart only goes up to the day before the latest available date: the latest date in the data is 2020-12-05, but the plot shows 2020-12-04.
Finally, here's the code of my plot:
confirmed_daily = alt.Chart(df.reset_index()).mark_bar(size=2).encode(
    alt.X('index:T', title=" "),
    alt.Y('confirmed_daily:Q', title=" "),
    tooltip=[
        alt.Tooltip('index:T', title="Fecha"),
        alt.Tooltip('confirmed:Q', title="Casos acumulados"),
        alt.Tooltip('confirmed_daily:Q', title="Nuevos Casos"),
    ]
).properties(
    title={
        "text": ["Casos diarios de COVID-19 en Colima"],
        "subtitle": ["Datos del 15 de marzo al 04 de diciembre de 2020.", " "]
    },
    width=800,
    height=400
)
confirmed_daily.save("graphs/confirmed_daily.html")
confirmed_daily
What am I failing to notice?
TL;DR – convert your date strings to pandas datetimes, and everything will work properly:
df.index = pd.to_datetime(df.index)
The issue, believe it or not, comes from a quirk of JavaScript date parsing. JavaScript parses date-only strings as UTC, but parses ISO 8601 date-time strings without a time zone offset as local time. You can observe this in your JavaScript console (I'm running this on a computer set to PST):
> new Date('2020-12-05')
Fri Dec 04 2020 16:00:00 GMT-0800 (Pacific Standard Time)
> new Date('2020-12-05T00:00:00')
Sat Dec 05 2020 00:00:00 GMT-0800 (Pacific Standard Time)
Because your dates are specified as strings that contain only year, month, and day, the Vega-Lite (JavaScript) renderer parses them as UTC, and they display as the previous day because your computer is set to a time zone west of GMT. (Had you been running this code in, say, Eastern Europe or China, it would have worked as expected. Isn't JavaScript fun?)
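The same rule can be mimicked in Python using only the standard library. This is a rough sketch of the behaviour, not what the browser actually runs: `js_like_parse` is a hypothetical helper written for this illustration, and the fixed UTC-8 offset is an assumption standing in for Pacific Standard Time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for Pacific Standard Time.
PST = timezone(timedelta(hours=-8))


def js_like_parse(text, local_tz):
    """Hypothetical helper that mimics the JavaScript parsing rule.

    Date-only strings are read as UTC midnight and then shown in local time;
    date-time strings without an offset are read directly as local time.
    """
    parsed = datetime.fromisoformat(text)
    if "T" in text:
        # Date-time string: interpret the wall-clock value as local time.
        return parsed.replace(tzinfo=local_tz)
    # Date-only string: interpret as UTC, then convert for display.
    return parsed.replace(tzinfo=timezone.utc).astimezone(local_tz)


date_only = js_like_parse("2020-12-05", PST)
date_time = js_like_parse("2020-12-05T00:00:00", PST)

print(date_only.isoformat())  # 2020-12-04T16:00:00-08:00
print(date_time.isoformat())  # 2020-12-05T00:00:00-08:00
```

With a local offset east of UTC, the date-only string would land on the expected calendar day, which matches the remark about Eastern Europe or China.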
Altair works around this "feature" of Javascript by ensuring that all date inputs are serialized as full ISO 8601 strings, but this applies only if those date inputs are actual pandas datetime types. For example, here are some dates specified as strings:
import altair as alt
import pandas as pd

df = pd.DataFrame({
    'date': ['2020-12-01', '2020-12-02', '2020-12-03', '2020-12-04'],
    'value': [2, 3, 1, 4],
})

alt.Chart(df).mark_bar().encode(
    x='value:Q',
    y='yearmonthdate(date):O',
)
Notice that in the rendered chart the days are off by one, because my computer is in the US/Pacific time zone, but the dates are parsed as UTC.
If you convert this column of strings to a column of pandas datetime objects, the result is what you would expect:
df['date'] = pd.to_datetime(df['date'])
alt.Chart(df).mark_bar().encode(
    x='value:Q',
    y='yearmonthdate(date):O',
)
Convert your string dates to pandas datetimes, and this issue should not come up again.
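If you are unsure whether an index still holds plain strings, a quick check before charting can help. The snippet below is a minimal sketch on a made-up two-row frame (the values are placeholders, not your data). The `strftime` line only illustrates what a full ISO 8601 string looks like; it is not a call into Altair's own serialization code.

```python
import pandas as pd

# Placeholder data: a string-indexed frame similar in shape to the one in the question.
df = pd.DataFrame(
    {"confirmed_daily": [3, 5]},
    index=["2020-12-04", "2020-12-05"],
)

# Before conversion the index holds strings, not datetimes.
was_datetime = pd.api.types.is_datetime64_any_dtype(df.index)

# Convert the index to real pandas datetimes.
df.index = pd.to_datetime(df.index)
is_datetime = pd.api.types.is_datetime64_any_dtype(df.index)

# Illustration only: full ISO 8601 date-time strings for the converted index.
iso = df.index.strftime("%Y-%m-%dT%H:%M:%S").tolist()

print(was_datetime, is_datetime)  # False True
print(iso)  # ['2020-12-04T00:00:00', '2020-12-05T00:00:00']
```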