Adding vertical average lines on top of a layered histogram in Altair


I am trying to add vertical lines marking the average of each dataset in a layered histogram in Altair (based on the example in the Altair documentation). My attempt below fails:

base = alt.Chart(outcomes)

bar = base.transform_fold(
    ['Push','Dealer Win','Player Win','Ace High Push'],
    as_=['Outcome','Outcomes out of 1000']
).mark_bar(
    opacity=0.3,
    binSpacing=0
).encode(
    alt.X('Outcomes out of 1000:Q', bin=alt.Bin(maxbins=100)),
    alt.Y('count()', stack=None),
    alt.Color('Outcome:N')
)

rule = base.transform_fold(
    ['Push','Dealer Win','Player Win','Ace High Push'],
    as_=['Count','Outcome']
).mark_rule(
    color='red'
).encode(
    alt.X('mean(Outcome):Q'),
    size=alt.value(2)
)

bar + rule

which results in an empty graph.

When I render just bar, though, the layered histogram displays fine:

(image: layered histogram without the average lines)

Basically, what I'm looking for is:

(image: the layered histogram with a vertical line at each dataset's average)

Thanks🙏

Update (less than an hour after original post):

Thanks @debbes for the speedy guidance! I was able to use your example to get this working via:

base = alt.Chart(outcomes).transform_fold(
    ['Push','Dealer Win','Player Win','Ace High Push'],
    as_=['Outcome','Outcomes out of 1000']
).transform_bin(
    field='Outcomes out of 1000',
    as_='bin_meas',
    bin=alt.Bin(maxbins=100)
).encode(
    color='Outcome:N'
)

hist = base.mark_bar(
    opacity=0.3,
    binSpacing=0
).encode(
    alt.X('bin_meas:Q'),
    alt.Y('count()', stack=None)
)

rule = base.mark_rule(
    size=2
).encode(
    alt.X('mean(Outcomes out of 1000):Q')
)

hist + rule

which results in:

(image: transform_bin_solution)


Solution

  • In this case you have to use transform_bin instead of doing the binning in the X encoding. The histogram and the rule can then both be built from the same base chart:

    base = alt.Chart(source).transform_fold(
        ['Trial A', 'Trial B', 'Trial C'],
        as_=['Experiment', 'Measurement']
    ).transform_bin(
        field='Measurement',
        as_='bin_meas',
        bin=alt.Bin(maxbins=100)
    ).encode(
        color='Experiment:N'
    )
    
    hist = base.mark_bar(opacity=0.3, binSpacing=0).encode(
        alt.X('bin_meas:Q'),
        alt.Y('count()', stack=None)
    )

    rule = base.mark_rule(size=2).encode(
        alt.X('mean(Measurement):Q')
    )
    
    hist + rule
    

    (image: resulting plot)