I have columns of data: one holds the average, which I want to show as bars, and one holds the standard deviation, which I want to show as error bars.
I'm just starting out with Altair, but I have quite a bit of data in this shape and would like to plot it with Altair.
Setting up the dataframe:
import altair as alt
import numpy as np
import pandas as pd

nums = 6
testmean = [np.random.rand()*25+75 for _ in range(nums)]
teststdev = [np.random.rand()*5 for _ in range(nums)]
testspecies = ['oak', 'elm', 'willow']
testspecies = 2*(nums//len(testspecies))*testspecies
testfert = [True for _ in range(nums//2)] + [False for _ in range(nums//2)]
testfert = 2*testfert
testmetric = ['average' for _ in range(nums)] + ['stdev' for _ in range(nums)]
testnums = testmean + teststdev
testset = zip(testnums, testspecies, testfert, testmetric, strict=True)
testdf = pd.DataFrame(testset, columns = ['number', 'species', 'fertilizer', 'type'])
I created the bars like I wanted:
bars = alt.Chart(testdf.loc[testdf['type'] == 'average']).mark_bar().encode(
    alt.X('fertilizer', title=None),
    alt.Y('number', title='mean'),
    color='fertilizer',
    column='species',
    tooltip=['species']
)
and this should work for the error bars:
error_bars = alt.Chart(testdf.loc[testdf['type'] == 'stdev']).mark_errorbar(extent='ci').encode(
    alt.X('fertilizer', title=None),
    alt.Y('number'),
    column='species',
    tooltip=['species']
)
but I have no idea how to combine the two. It looks like facet won't work because the data isn't in 'long form', and I get the error "Faceted charts cannot be layered" when trying to layer the two charts with bars + error_bars:
alt.layer(bars, error_bars, data=testdf).facet(
    column='species'
)
The question "Grouped bar charts in Altair using two different columns" introduced the concept of long form, but it doesn't answer this question in a way I could see.
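As you pointed out, you can't layer faceted charts, but you can facet layered charts. So removing column= from the individual charts and using it only in the facet will work. The data also needs to be pivoted to wide form first, so that average and stdev end up as separate columns on the same row: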
wide_df = testdf.pivot(index=['species', 'fertilizer'], columns='type', values='number').reset_index()
bars = alt.Chart().mark_bar().encode(
    alt.X('fertilizer', title=None),
    alt.Y('average', title='mean'),
    color='fertilizer',
    # column='species',
    tooltip=['species']
)
error_bars = alt.Chart().mark_rule().encode(
    alt.X('fertilizer', title=None),
    alt.Y('errbar_min:Q'),
    alt.Y2('errbar_max:Q'),
    # column='species',
    tooltip=['species']
).transform_calculate(
    errbar_min=alt.datum.average - alt.datum.stdev / 2,
    errbar_max=alt.datum.average + alt.datum.stdev / 2
)
alt.layer(bars, error_bars, data=wide_df).facet(
    column='species'
)
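If it helps to check the reshaping on its own, here is a minimal pandas-only sketch of the pivot and of the error bar bounds computed above. The small `long_df` frame and its values are made up for illustration; only the column names match `testdf`. Computing `errbar_min` and `errbar_max` in pandas is an alternative to `transform_calculate`, not something the chart requires.

```python
import pandas as pd

# Hypothetical long-form data in the same shape as testdf: one row per
# (species, fertilizer, type) combination. The numbers are invented.
long_df = pd.DataFrame(
    {
        "number": [80.0, 90.0, 4.0, 2.0],
        "species": ["oak", "oak", "oak", "oak"],
        "fertilizer": [True, False, True, False],
        "type": ["average", "average", "stdev", "stdev"],
    }
)

# Pivot so that 'average' and 'stdev' become columns of the same row.
wide = long_df.pivot(
    index=["species", "fertilizer"], columns="type", values="number"
).reset_index()
wide.columns.name = None  # drop the leftover 'type' label on the column index

# Same half-stdev offsets as the transform_calculate call above.
wide["errbar_min"] = wide["average"] - wide["stdev"] / 2
wide["errbar_max"] = wide["average"] + wide["stdev"] / 2

print(wide)
```

With the bounds already in the dataframe, the `transform_calculate` call could be dropped and the rule would encode `errbar_min` and `errbar_max` directly.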