I am trying to create a condition that involves an aggregate field. For this example dataset and chart:
import altair as alt
import pandas as pd
df = pd.DataFrame(
    [['game1', 'player1', 2, 1], ['game1', 'player2', 3, 4], ['game1', 'player3', 2, 2],
     ['game2', 'player1', 0, 3], ['game2', 'player2', 4, 4], ['game2', 'player3', 3, 3]],
    columns=['game', 'player', 'score1', 'score2'],
)
# Color each point by comparing the two scores in the same row
color = {'condition': [
    {"value": "green", "test": "datum.score2 > datum.score1"},
    {"value": "yellow", "test": "datum.score2 == datum.score1"},
    {"value": "red", "test": "datum.score2 < datum.score1"},
]}
alt.Chart(df).mark_point().encode(x='score2', y='player', color=color)
I get this chart:
But when I want a chart that displays only the average for each player, I can't figure out a syntax that works for the condition:
alt.Chart(df).mark_point().encode(x='mean(score2)',y='player',color=color)
I tried:
"test":"mean(datum.score2) > mean(datum.score1)"
and
"test":"datum.mean(score2) > datum.mean(score1)"
Neither of them worked, and I couldn't find any guidance on the syntax in the documentation.
mean() is a shorthand in Altair that is available in encoding fields and transforms but not directly in conditions. To use the mean values in a condition, you need to create new columns for the mean values in a separate step via transform_aggregate (here we use transform_joinaggregate since you want to plot the original values in your dataframe and not the aggregated values):
color = {
    'condition': [
        {"value": "green", "test": "datum.mean_score2 > datum.mean_score1"},
        {"value": "yellow", "test": "datum.mean_score2 == datum.mean_score1"},
        {"value": "red", "test": "datum.mean_score2 < datum.mean_score1"}
    ]
}
alt.Chart(df).mark_point().encode(
    x='score2',
    y='player',
    color=color
).transform_joinaggregate(
    mean_score1='mean(score1)',
    mean_score2='mean(score2)',
    groupby=['player']
)
If you want to plot the mean values, it would look like this:
alt.Chart(df).mark_point().encode(
    x='mean_score2:Q',
    y='player',
    color=color
).transform_aggregate(
    mean_score1='mean(score1)',
    mean_score2='mean(score2)',
    groupby=['player']
)
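To make the difference between the two transforms concrete, here is a minimal plain-Python sketch of what each one does to the rows of the example dataset. It is only an illustration of the behavior described above, not how Altair or Vega-Lite implements it, and the helper names (group_means, aggregate, joinaggregate, color_for) are hypothetical and not part of the Altair API.

```python
from collections import defaultdict
from statistics import mean

# Same data as the example dataframe, one dict per row.
rows = [
    {"game": "game1", "player": "player1", "score1": 2, "score2": 1},
    {"game": "game1", "player": "player2", "score1": 3, "score2": 4},
    {"game": "game1", "player": "player3", "score1": 2, "score2": 2},
    {"game": "game2", "player": "player1", "score1": 0, "score2": 3},
    {"game": "game2", "player": "player2", "score1": 4, "score2": 4},
    {"game": "game2", "player": "player3", "score1": 3, "score2": 3},
]


def group_means(rows):
    """Compute the mean of score1 and score2 for each player."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["player"]].append(row)
    return {
        player: {
            "mean_score1": mean([r["score1"] for r in members]),
            "mean_score2": mean([r["score2"] for r in members]),
        }
        for player, members in groups.items()
    }


def aggregate(rows):
    """Mimics transform_aggregate: one output row per group."""
    return [{"player": p, **m} for p, m in group_means(rows).items()]


def joinaggregate(rows):
    """Mimics transform_joinaggregate: original rows plus the group means."""
    means = group_means(rows)
    return [{**row, **means[row["player"]]} for row in rows]


def color_for(datum):
    """Plain-Python equivalent of the three test expressions in the condition."""
    if datum["mean_score2"] > datum["mean_score1"]:
        return "green"
    if datum["mean_score2"] == datum["mean_score1"]:
        return "yellow"
    return "red"


print(len(aggregate(rows)), len(joinaggregate(rows)))
for datum in aggregate(rows):
    print(datum["player"], color_for(datum))
```

In both cases the fields mean_score1 and mean_score2 exist on every row that reaches the condition, which is why the same test expressions work with either transform. The only difference is whether the original per-game rows are kept.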