TL;DR: How do I make a grouped bar chart in the most recent version of Altair where the grouped bars come from different columns of quantitative data, rather than from one column of categorical data?
While I've found some great answers on here about creating grouped bar charts in Altair (like this one), none answer my specific question.
I have a table with several columns. Two are quantitative and hold values that belong to one category (e.g. 'cm_of_rain' and 'cm_of_snow' could be summed and called something like 'cm_of_precipitation'), one holds the months as ordinal strings, and another holds the day as a number. A dataframe of the data looks something like this:
import pandas as pd
data = {'Month': ['Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Apr', 'Apr'],
        'Day': [1, 15, 1, 15, 1, 15, 1, 15],
        'cm_of_rain': [20, 21, 19, 18, 1, 12, 33, 12],
        'cm_of_snow': [0, 2, 6, 3, 4, 2, 5, 11]}
df = pd.DataFrame(data)
print(df)
  Month  Day  cm_of_rain  cm_of_snow
0   Jan    1          20           0
1   Jan   15          21           2
2   Feb    1          19           6
3   Feb   15          18           3
4   Mar    1           1           4
5   Mar   15          12           2
6   Apr    1          33           5
7   Apr   15          12          11
I want to make a bar plot where the data is grouped by month on the x-axis and cm of precipitation is shown on the y-axis. Rather than a stacked bar plot where rain and snow are additive, I want to plot the two values as side-by-side bars for each month. The result should look something like the grouped bar plot from the post linked above, except that Genre ("Action", "Crime") would be replaced by Month ("Jan", "Feb", "Mar", "Apr"), Gender (F, M) would be replaced by Precipitation_Type (rain, snow), and Rating would be replaced by Precipitation_(cm).
For context, the main difference between my question and earlier ones is that the data I want to group comes from two different columns of quantitative data in my dataframe, whereas every other post I've seen uses categorical data from a single column.
What you have is usually referred to as "wide form" or "untidy" data. Altair generally works better with "long form" or "tidy" data. You can read more about how to convert between the two in the documentation, but one way would be to use transform_fold.
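For comparison, the same wide-to-long reshaping can be done in pandas before the data reaches Altair. This is only a minimal sketch: it assumes the columns have been renamed to 'rain' and 'snow' as in this answer, and the output column names 'type' and 'amount_cm' are illustrative choices, not anything Altair requires.

```python
import pandas as pd

# Same sample data as in the question, with the value columns renamed
# to 'rain' and 'snow' as in this answer.
df = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Apr', 'Apr'],
    'Day': [1, 15, 1, 15, 1, 15, 1, 15],
    'rain': [20, 21, 19, 18, 1, 12, 33, 12],
    'snow': [0, 2, 6, 3, 4, 2, 5, 11],
})

# Wide -> long: each (Month, Day) row becomes two rows, one per
# precipitation type. 'type' and 'amount_cm' are illustrative names.
long_df = df.melt(
    id_vars=['Month', 'Day'],
    value_vars=['rain', 'snow'],
    var_name='type',
    value_name='amount_cm',
)

print(long_df.head())
```

The transform_fold approach performs the same reshaping inside the chart specification, so the DataFrame itself can stay wide.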
import altair as alt
import pandas as pd
data = {'Month': ['Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Apr', 'Apr'],
        'Day': [1, 15, 1, 15, 1, 15, 1, 15],
        'rain': [20, 21, 19, 18, 1, 12, 33, 12],
        'snow': [0, 2, 6, 3, 4, 2, 5, 11]}
df = pd.DataFrame(data)
# One facet row per month; within each row, one horizontal bar per precipitation type.
alt.Chart(df).mark_bar().encode(
    x='amount (cm):Q',
    y='type:N',
    color='type:N',
    row=alt.Row('Month', sort=['Jan', 'Feb', 'Mar', 'Apr'])
).transform_fold(
    # Fold the 'rain' and 'snow' columns into key/value columns 'type' and 'amount (cm)'.
    as_=['type', 'amount (cm)'],
    fold=['rain', 'snow']
)