
Interactive plot selection in Altair does not highlight points


I am trying to generate 2 plots in Altair that share the same selection.

I would like to plot scatter and bar charts of population (y) vs. age (x), using the population dataset from vega_datasets. The dataset has columns for year, people, age and sex. The total population is sum(people), which I plot on the y axis against age. The bar chart plots the same sum(people) versus age, colored by the sex column.

I am trying to set up a brush/selection shared between these 2 plots, so that when I highlight points in the scatter plot, the bar plot updates to reflect that selection. However, I am stuck on the problem described below.

My code is adapted from the layered bar chart example in the Altair documentation.

Here is the code:

import altair as alt
from altair.expr import datum, if_
from vega_datasets import data
interval = alt.selection_interval(encodings=['x', 'y'])

df = data.population.url

scatter = alt.Chart(df).mark_point().encode(
    alt.X('age:O', axis=alt.Axis(title='')),
    y='sum(people)',
    color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
).transform_filter(
    filter = datum.year == 2000
).transform_calculate(
    "sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(
    selection=interval
)

bar = alt.Chart(df).mark_bar(opacity=0.7).encode(
    alt.X('age:O', scale=alt.Scale(rangeStep=17)),
    alt.Y('sum(people)', stack=None),
    color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
).transform_filter(
    filter = datum.year == 2000
).transform_calculate(
    "sex", if_(datum.sex == 2, 'Female', 'Male')
).properties(height=100, width=400)

scatter & bar

I have modified the code from the documentation example. I first create a scatter plot whose color depends on the selection. Then I define a bar plot of the same 2 columns, again using the selection to set the color. Here is the output:

[Image: initial output, with the scatter plot above the bar chart]

Now, I would like to drag a box across the top (scatter) plot to select some points and have the bottom (bar) chart update simultaneously based on that selection. When I drag in the top plot to make my selection, this happens:

[Image: both plots after dragging a selection in the scatter plot]

Problems

  1. After dragging to make a selection in the top plot, the colors in both plots change to lightgrey, both inside and outside the selection. I expected that, in both plots, the marks inside the selection/brush would be highlighted and only those outside would be lightgrey.

How can I get a selection that is highlighted in both the top and bottom plots simultaneously?

EDIT

I want this behaviour, where a brush/selection in one plot is simultaneously highlighted in a 2nd (linked) plot.

Package versions:

Python = 3.6
Altair = 2.2
Jupyter = 5.6

Solution

  • To trigger a selection on an aggregated value, the best approach is to use an aggregate transform to define that quantity so that it is available to the entire chart.

    Here is an example:

    import altair as alt
    from altair.expr import datum, if_
    from vega_datasets import data
    
    interval = alt.selection_interval(encodings=['x', 'y'])
    
    
    # Filter, relabel, and aggregate once, so that both charts share the
    # same pre-computed `population` field.
    base = alt.Chart(data.population.url).transform_filter(
        filter = datum.year == 2000
    ).transform_calculate(
        "sex", if_(datum.sex == 2, 'Female', 'Male')
    ).transform_aggregate(
        population="sum(people)",
        groupby=['age', 'sex']
    )
    
    scatter = base.mark_point().encode(
        alt.X('age:O', title=''),
        y='population:Q',
        color=alt.condition(interval, 'sex:N', alt.value('lightgrey'))
    ).properties(
        selection=interval
    )
    
    bar = base.mark_bar(opacity=0.7).encode(
        alt.X('age:O', scale=alt.Scale(rangeStep=17)),
        alt.Y('population:Q'),
        color=alt.condition(interval, 'sex:N', alt.value('lightgrey')),
    ).properties(height=100, width=400)
    
    scatter & bar
    

    Note that I took away the filtering by the interval selection on the lower plot, because that's not the behavior you described.
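
To make the role of the aggregate transform more concrete, here is a minimal plain-Python sketch of what the filter, calculate and aggregate steps above produce: one record per (age, sex) pair with an explicit population field. The rows are made up for illustration, and aggregate_population and in_brush are hypothetical helpers, not Altair functions. The brush check is only a simplified model of an interval selection over x and y, not how Vega-Lite actually evaluates it.

```python
from collections import defaultdict

# Hypothetical rows shaped like the population dataset (values are made up).
rows = [
    {"year": 2000, "age": 0, "sex": 1, "people": 100},
    {"year": 2000, "age": 0, "sex": 2, "people": 90},
    {"year": 2000, "age": 5, "sex": 1, "people": 80},
    {"year": 2000, "age": 5, "sex": 2, "people": 85},
    {"year": 1990, "age": 0, "sex": 1, "people": 70},
]


def aggregate_population(rows, year):
    """Keep one year, relabel sex, and sum people per (age, sex) group."""
    totals = defaultdict(int)
    for row in rows:
        if row["year"] != year:
            continue
        sex = "Female" if row["sex"] == 2 else "Male"
        totals[(row["age"], sex)] += row["people"]
    return [
        {"age": age, "sex": sex, "population": total}
        for (age, sex), total in sorted(totals.items())
    ]


def in_brush(record, ages, population_range):
    """Simplified brush test: age is selected and population is in range."""
    low, high = population_range
    return record["age"] in ages and low <= record["population"] <= high


aggregated = aggregate_population(rows, 2000)
selected = [
    r for r in aggregated
    if in_brush(r, ages={0}, population_range=(95, 200))
]

print(aggregated)
print(selected)
```

Because every aggregated record carries its own population value, the same test can be applied to the records behind either chart, which is the idea behind defining the quantity once in the shared base chart.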