Tags: python-3.x, data-visualization, visualization, altair, vega-lite

Is there a way to select or highlight last or first "n" data points in Altair?


One thing I have found lacking lately is the ability to highlight or select just the last n data points in Altair. For example, with a time series that is updated daily, I would like to select or highlight the most recent 7-day window.

The issue with alt.condition is that you have to explicitly specify the date or value from which the selection or highlight starts. For a time series that updates frequently, this means changing that value by hand on every update.

One possible workaround is to use plain Python: if the x axis holds datetime data, compute the cutoff programmatically and pass it into the condition, perhaps using f-strings.

Beyond the approaches above, is there a way built natively into Altair/Vega-Lite to select the last or first n data points?

A contrived example using f-strings:

import altair as alt
import pandas as pd

index = 7  # hard-coded cutoff: a perhaps bad way to highlight the last 2 data points
data = pd.DataFrame({'time':[0,1,2,3,4,5,6,7,8,9], 'value':[1,2,4,8,16,15,14,13,12,11]})

bar = alt.Chart(data).mark_bar(opacity=1, width=15).encode(
    x='time:T',
    y='value:Q',
    color = alt.condition(alt.datum.time>f'{index}', alt.value('red'), alt.value('steelblue'))
)

text = bar.mark_text(align='center', dy=-10).encode(
    text='value:Q'
)

bar+text
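
The hard-coded index above could instead be derived from the data, which is the "native Python" workaround described earlier. The following is only a minimal sketch under that assumption: cutoff_for_last_n is a hypothetical helper name, not part of Altair, and it relies only on the time values being sortable.

```python
def cutoff_for_last_n(times, n):
    """Return the value t such that exactly n distinct entries of `times` exceed t."""
    distinct = sorted(set(times))
    if not 0 < n < len(distinct):
        raise ValueError("n must be at least 1 and smaller than the number of distinct times")
    # The (n + 1)-th largest value: everything strictly above it is the last n.
    return distinct[-(n + 1)]


times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # same 'time' column as in the example
index = cutoff_for_last_n(times, 2)
highlighted = [t for t in times if t > index]
print(index, highlighted)  # prints: 7 [8, 9]
```

The resulting index can be dropped into the condition in place of the literal 7, so the cutoff follows the data as it grows.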



Solution

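  • You can do this using a window transform, in a similar way to the Top-K Items example: rank the rows by time in descending order, then highlight the rows whose rank is at most num_items: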

    import altair as alt
    import pandas as pd
    
    data = pd.DataFrame({'time':[0,1,2,3,4,5,6,7,8,9], 'value':[1,2,4,8,16,15,14,13,12,11]})
    num_items = 2
    
    base = alt.Chart(data).transform_window(
        rank='rank()',
        sort=[alt.SortField('time', order='descending')]
    )
    
    bar = base.mark_bar(opacity=1, width=15).encode(
        x='time:T',
        y='value:Q',
        color = alt.condition(alt.datum.rank<=num_items, alt.value('red'), alt.value('steelblue'))
    )
    
    text = bar.mark_text(align='center', dy=-10).encode(
        text='value:Q'
    )
    
    bar+text
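
To check which rows the rank condition would pick, the ranking can be emulated in plain Python. This is only a rough sketch, not how Vega-Lite implements it: window_rank is a hypothetical helper, and the tie handling (tied values share a rank, with a gap afterward) is an assumption about rank() that does not matter when the time values are unique, as they are here.

```python
def window_rank(values, descending=True):
    """Emulate rank() over rows sorted by `values`; ties share the lowest position."""
    ordered = sorted(values, reverse=descending)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (lowest) position seen for each value.
        first_position.setdefault(value, position)
    return [first_position[v] for v in values]


times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # same 'time' column as in the chart
num_items = 2

last_ranks = window_rank(times, descending=True)    # newest row gets rank 1
first_ranks = window_rank(times, descending=False)  # oldest row gets rank 1
last_n = [t for t, r in zip(times, last_ranks) if r <= num_items]
first_n = [t for t, r in zip(times, first_ranks) if r <= num_items]
print(last_n, first_n)  # prints: [8, 9] [0, 1]
```

For the chart itself, highlighting the first n points instead of the last n should only need order='ascending' in alt.SortField, with the rest of the code unchanged.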
    
