Tags: python-3.x, data-visualization, visualization, altair, vega-lite

How to ignore or clip negative values in altair charts from the chart code itself?


I do not want to show the negative value in the bar chart. The main goal is to avoid the y-axis offset it causes (in my actual problem the chart is faceted), so any way to achieve this is fine, clipping for example, as long as it is done from the chart itself rather than at the data level.

I thought of using alt.Scale, but its domain requires an explicit upper limit, which I do not know beforehand, and I cannot find a way to compute the maximum of the values programmatically.

You can use the following demo chart:

import pandas as pd
import altair as alt

dd = pd.DataFrame({'a': [0,1,2,3,4,5], 'b': [10,14, -5, 15, 0, 5]})
a = alt.Chart().mark_bar().encode(
    x='a',
    y=alt.Y('b:Q')
)
b = alt.Chart().mark_line().transform_window(
    rolling_mean = 'mean(b)',
    frame=[-2, 0]).encode(
    x='a',
    y='rolling_mean:Q'
)
alt.layer(a, b, data=dd)

Solution

  • There are only two ways I know of to hide data on a chart. First, you can set an explicit scale domain and set clip=True for the relevant marks:

    import pandas as pd
    import altair as alt
    
    dd = pd.DataFrame({'a': [0,1,2,3,4,5], 'b': [10,14, -5, 15, 0, 5]})
    a = alt.Chart().mark_bar(clip=True).encode(
        x='a',
        y=alt.Y('b:Q', scale=alt.Scale(domain=[0, 16]))
    )
    b = alt.Chart().mark_line().transform_window(
        rolling_mean = 'mean(b)',
        frame=[-2, 0]).encode(
        x='a',
        y='rolling_mean:Q'
    )
    alt.layer(a, b, data=dd)
    


    Second, you can apply a filter transform to remove rows from your dataset (note that the condition b > 0 also drops rows where b is exactly 0; use b >= 0 to keep them):

    import pandas as pd
    import altair as alt
    
    dd = pd.DataFrame({'a': [0,1,2,3,4,5], 'b': [10,14, -5, 15, 0, 5]})
    a = alt.Chart().mark_bar().encode(
        x='a',
        y=alt.Y('b:Q', scale=alt.Scale(domain=[0, 16]))
    )
    b = alt.Chart().mark_line().transform_window(
        rolling_mean = 'mean(b)',
        frame=[-2, 0]).encode(
        x='a',
        y='rolling_mean:Q'
    )
    alt.layer(a, b, data=dd).transform_filter(alt.datum.b > 0)
    


    Note the difference: because this transform was applied at the top level, it removes rows for both layers, so the rolling mean is computed without them. If you instead apply the filter to only one of the subcharts, the rows are removed from that layer only:

    import pandas as pd
    import altair as alt
    
    dd = pd.DataFrame({'a': [0,1,2,3,4,5], 'b': [10,14, -5, 15, 0, 5]})
    a = alt.Chart().transform_filter(
        alt.datum.b > 0
    ).mark_bar().encode(
        x='a',
        y=alt.Y('b:Q', scale=alt.Scale(domain=[0, 16]))
    )
    b = alt.Chart().mark_line().transform_window(
        rolling_mean = 'mean(b)',
        frame=[-2, 0]).encode(
        x='a',
        y='rolling_mean:Q'
    )
    alt.layer(a, b, data=dd)
    

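To see why the placement of the filter matters for the line layer, the rolling mean can be reproduced by hand. The sketch below uses plain Python rather than Altair; `rolling_mean` is a hypothetical helper that mimics a window of frame=[-2, 0] over the rows in their given order.

```python
def rolling_mean(values, back=2):
    """Mean of each value together with up to `back` preceding values."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - back): i + 1]
        out.append(sum(window) / len(window))
    return out

b = [10, 14, -5, 15, 0, 5]

# Filter applied to the bar layer only: the line layer still sees every row.
line_all_rows = rolling_mean(b)

# Filter applied at the top level: rows with b <= 0 are removed
# before the window transform of the line layer runs.
line_filtered = rolling_mean([v for v in b if v > 0])

print([round(v, 2) for v in line_all_rows])
print([round(v, 2) for v in line_filtered])
```

The first list has six points and the second only four, and the values differ wherever the removed rows would have fallen inside the window, which is why the line changes shape when the filter is moved to the top level.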