Tags: python-3.x, altair, vega-lite

Altair's selection and transform_filter via binding_range slider for datetime values doesn't seem to work with equality condition or selector itself


I want to bind a range slider with datetime values so that a chart shows data for one particular date only. Using the stocks data, the x-axis should show the companies and the y-axis the stock price on a single day, which the user selects with the slider.

Based on inputs from this answer and this issue, I have the following code. With an inequality condition in transform_filter, it shows something once the slider is moved past one particular value, but the chart is empty for the rest of the range. What is peculiar is that an inequality operator at least shows something, whereas everything is empty when the operator is ==.

import altair as alt
import pandas as pd  # needed by timestamp() below
from vega_datasets import data

source = data.stocks()

def timestamp(t):
  return pd.to_datetime(t).timestamp()

slider = alt.binding_range(step=86400, min=timestamp(min(source['date'])), max=timestamp(max(source['date'])))  # 86400 is the number of seconds between consecutive days

select_date = alt.selection_single(fields=['date'], bind=slider, init={'date': timestamp(min(source['date']))})

alt.Chart(source).mark_bar().encode(
    x='symbol',
    y='price',
).add_selection(select_date).transform_filter(alt.datum.date == select_date.date)

Since the output is empty, I am inclined to conclude that the transform_filter is causing the problem. I have been at it for more than 6 hours now and have tried all sorts of permutations and combinations of alt.expr.toDate and other conversions, but I cannot get it to work.

I also tried just transform_filter(select_date.date) and transform_filter(date), along with other variations, but nothing quite works.

The expected output is that the heights of the bars change (because the data is filtered on date) as the user drags the slider.

Any help would be really appreciated.


Solution

  • There are several issues here:

    • In Vega-Lite, timestamps are expressed in milliseconds, not seconds
    • You are filtering on equality between a numerical timestamp and a string representation of a date.
    • Even if you parse the date in the filter expression, Python date parsing and JavaScript date parsing behave differently, and the results will generally not match. Even within JavaScript, date parsing behavior can vary from browser to browser. All this means that filtering on equality between a Python timestamp and a JavaScript timestamp is generally problematic.
    • The data you are using has monthly timestamps, so the slider step should account for this
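
To make the first and third points concrete, here is a minimal sketch using only the Python standard library rather than pandas or Altair. The helper name to_epoch_ms and the UTC-5 offset are illustrative assumptions, not anything from the question; the sketch only shows how far off a seconds value is when read as milliseconds, and how the same calendar date read in two time zones gives two different timestamps.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(dt):
    """Milliseconds since the Unix epoch (hypothetical helper for this sketch)."""
    return int(dt.timestamp() * 1000)


jan_2000_utc = datetime(2000, 1, 1, tzinfo=timezone.utc)

seconds = jan_2000_utc.timestamp()  # the unit used in the question's slider
millis = to_epoch_ms(jan_2000_utc)  # the unit a millisecond-based consumer expects

# Reading the seconds value as if it were milliseconds lands in January 1970.
misread = datetime.fromtimestamp(seconds / 1000, tz=timezone.utc)

# The same calendar date read in another zone (UTC-5, chosen arbitrarily)
# is a different instant, so an exact equality test between the two fails.
jan_2000_minus5 = datetime(2000, 1, 1, tzinfo=timezone(timedelta(hours=-5)))
offset_ms = to_epoch_ms(jan_2000_minus5) - millis

print(seconds, millis, misread.isoformat(), offset_ms)
```

A five-hour difference in how a date string is interpreted is enough to make an exact == comparison fail, which is why comparing coarser units such as year and month is more forgiving.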

    Keeping all that in mind, the best course would probably be to adjust the slider values and filter on matching year and month, rather than trying to achieve equality in the exact timestamp. The result looks like this:

    import altair as alt
    from vega_datasets import data
    import pandas as pd
    
    source = data.stocks()
    
    def timestamp(t):
      return pd.to_datetime(t).timestamp() * 1000
    
    slider = alt.binding_range(
        step=30 * 24 * 60 * 60 * 1000, # 30 days in milliseconds
        min=timestamp(min(source['date'])),
        max=timestamp(max(source['date'])))
    
    select_date = alt.selection_single(
        fields=['date'],
        bind=slider,
        init={'date': timestamp(min(source['date']))},
        name='slider')
    
    alt.Chart(source).mark_bar().encode(
        x='symbol',
        y='price',
    ).add_selection(select_date).transform_filter(
        "(year(datum.date) == year(slider.date[0])) && "
        "(month(datum.date) == month(slider.date[0]))"
    )
    

    You can view the result here: vega editor.
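
One thing that may be worth checking with the fixed 30-day step: calendar months are not all 30 days long, so the slider positions can land in the same month twice or step over a short month. The sketch below is plain standard-library Python, not Altair; the function name months_visited is hypothetical, and it works in UTC for simplicity, whereas the browser's local time zone can shift which month a given value falls in.

```python
from datetime import datetime, timedelta, timezone


def months_visited(start, stop, step_days=30):
    """Return the (year, month) pairs hit when stepping from start to stop."""
    visited = []
    current = start
    while current <= stop:
        visited.append((current.year, current.month))
        current += timedelta(days=step_days)
    return visited


# Step through the year 2000 from January 1, as the slider's min would.
start = datetime(2000, 1, 1, tzinfo=timezone.utc)
stop = datetime(2000, 12, 31, tzinfo=timezone.utc)

visited = months_visited(start, stop)
missed = [(2000, m) for m in range(1, 13) if (2000, m) not in visited]

print("visited:", visited)
print("missed:", missed)
```

If a skipped month matters for your chart, one option is a smaller step, at the cost of several slider positions mapping to the same month.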