I have a dataset df that contains a computed frequency curve that looks like this:
      aep     flow    variance   n-day
0   0.001  64480.8  0.01190750  01-day
1   0.002  56995.7  0.00925476  01-day
2   0.005  47984.8  0.00633636  01-day
3   0.01   41772.4  0.00456081  01-day
4   0.02   36024.0  0.00314372  01-day
5   0.05   29043.5  0.00179256  01-day
6   0.1    24145.0  0.00113570  01-day
7   0.2    19466.1  0.00075381  01-day
8   0.5    13215.4  0.00055517  01-day
9   0.8     9261.1  0.00054307  01-day
10  0.9     7785.4  0.00066750  01-day
11  0.95    6787.7  0.00094589  01-day
Here df.aep is the annual exceedance probability and df.flow is the observed streamflow. I am interested in making a flood frequency plot, which commonly has a log-scaled y-axis and a reversed normal-probability x-axis.
For this chart spec:
alt.Chart(df).transform_calculate(
    normal='quantileNormal(datum.aep)'
).mark_line().encode(
    x=alt.X('aep:Q', axis=alt.Axis(format='%'), scale=alt.Scale(reverse=True)),
    y=alt.Y('flow:Q', scale=alt.Scale(type='log')),
    color='n-day'
)
I am getting a plot that looks like:
Ultimately, I am more interested in a plot that uses the quantileNormal transform for the x-axis:
alt.Chart(df).transform_calculate(
    normal='quantileNormal(datum.aep)'
).mark_line().encode(
    x=alt.X('normal:Q', scale=alt.Scale(reverse=True)),
    y=alt.Y('flow:Q', scale=alt.Scale(type='log')),
    color='n-day'
)
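As a cross-check outside the chart, the x-values that quantileNormal produces can be reproduced in plain Python. This is only a minimal sketch: it assumes quantileNormal with its default mean 0 and standard deviation 1 is the standard normal quantile function, and it uses statistics.NormalDist from the standard library as a stand-in, which is not part of the chart spec above.

```python
from statistics import NormalDist

# AEP values copied from the table above
aeps = [0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9, 0.95]

# inv_cdf is the quantile function (inverse CDF) of the standard normal
std_normal = NormalDist(mu=0.0, sigma=1.0)
z_scores = [round(std_normal.inv_cdf(p), 3) for p in aeps]

for p, z in zip(aeps, z_scores):
    print(f"aep={p:<6} z={z}")
```

Small probabilities map to large negative z-scores, which is why the axis is reversed to put rare floods on the right.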
But now I am at the point where I need the scaling provided by quantileNormal(datum.aep), which gives z-scores, combined with the tick labels from df.aep. Is there a way to keep the z-score positions but show df.aep as the axis labels? I am thinking I can set up a labelExpr to map in the labels using:
import pandas as pd
from altair_transform import extract_data

# a is the chart defined above, the one with the quantileNormal transform
aa = extract_data(a)
aa.normal = aa.normal.round(3)

# z-scores and AEPs as Series indexed by the letters a..l
bb = aa[['normal']]
bb.index = list('abcdefghijkl')
bb = pd.Series(index=bb.index, data=bb.values.flatten())
cc = aa[['aep']]
cc.index = list('abcdefghijkl')
cc = pd.Series(index=cc.index, data=cc.values.flatten())

# build a nested ternary expression mapping each z-score to its AEP label
s = ''
for norm, aep in zip(bb.items(), cc.items()):
    s += f"datum.normal == {norm[1]} ? '{aep[1]:.3f}' : "
s += 'null'
where s evaluates to:
"datum.normal == -3.09 ? '0.001' : datum.normal == -2.878 ? '0.002' : datum.normal == -2.576 ? '0.005' : datum.normal == -2.326 ? '0.010' : datum.normal == -2.054 ? '0.020' : datum.normal == -1.645 ? '0.050' : datum.normal == -1.282 ? '0.100' : datum.normal == -0.842 ? '0.200' : datum.normal == 0.0 ? '0.500' : datum.normal == 0.842 ? '0.800' : datum.normal == 1.282 ? '0.900' : datum.normal == 1.645 ? '0.950' : null"
Then, if I change my chart spec to:
a = alt.Chart(aa).mark_line().encode(
    x=alt.X(
        'normal:Q',
        axis=alt.Axis(
            values=[bb.a, bb.b, bb.c, bb.d, bb.e, bb.f, bb.g, bb.h, bb.i, bb.j, bb.k, bb.l],
            labelExpr=s
        ),
        scale=alt.Scale(reverse=True)
    ),
    y=alt.Y('flow:Q', scale=alt.Scale(type='log')),
    color='n-day'
)
I get a chart with null labels. Can labelExpr be used like this?
I believe you need to use datum.label or datum.value, rather than datum.normal, to compare against the value of the label:
import altair as alt
import pandas as pd

alt.Chart(pd.DataFrame({'x': [3.21, 1.23, 4.56], 'y': [1, 2, 3]})).mark_point().encode(
    x=alt.X(
        'x',
        axis=alt.Axis(
            values=[1, 2, 3],
            # datum.label and datum.value refer to the tick; datum.x is included for comparison
            labelExpr='datum.label == 1 ? "label" : datum.value == 2 ? "value" : datum.x == 3 ? "x" : null'
        )
    ),
    y='y'
)
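Applied to the flood frequency chart from the question, the same idea might look like the sketch below. The helper build_label_expr is hypothetical (it is not part of Altair), and the AEP values and rounded z-scores are copied from the question. It assumes the tick positions passed to values are exactly the rounded numbers that appear in the expression, since the comparison is an exact equality on datum.value.

```python
# AEPs and their rounded z-scores, copied from the question
aeps = [0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9, 0.95]
z_scores = [-3.09, -2.878, -2.576, -2.326, -2.054, -1.645,
            -1.282, -0.842, 0.0, 0.842, 1.282, 1.645]


def build_label_expr(ticks, labels):
    """Hypothetical helper: one ternary branch per tick, testing datum.value."""
    branches = [
        f"datum.value == {tick} ? '{label:.3f}'"
        for tick, label in zip(ticks, labels)
    ]
    # ticks that match no branch fall through to null
    return " : ".join(branches) + " : null"


label_expr = build_label_expr(z_scores, aeps)
print(label_expr)

# Intended use in the chart spec (not executed here):
# alt.X('normal:Q',
#       axis=alt.Axis(values=z_scores, labelExpr=label_expr),
#       scale=alt.Scale(reverse=True))
```

The only change from the expression in the question is that each branch tests datum.value, the tick position, instead of datum.normal, which is a data field.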