Tags: python-3.x, pandas, bokeh

How do I create a Bokeh Select menu for a line plot for an indeterminate number of options?


I've been trying to get a Select menu and a Bokeh plot working with a dataset. The dataset can be found here. I have no experience with JavaScript, but I believe my Select menu isn't connected to my plot: I get a plot outline, but no data is displayed. When I run the script from the console with bokeh serve --show test.py, I get the first 7 notifications in my JS console. The last three (those in the red bracket in the screenshot) appear when I try to change to a different item in the Select menu.

[Screenshot: JS errors]

Goal: Display the plot of the data for the rows whose ID number ('ndc' in this example) is selected in the Select menu.

Here's the code I started from (modified from this post). I also drew on this one, a handful of others, and the Bokeh documentation itself.

import pandas as pd
from bokeh.io import curdoc, output_notebook, output_file
from bokeh.layouts import row, column
from bokeh.models import Select, DataRange1d, ColumnDataSource
from bokeh.plotting import figure

# output_notebook()
output_file('test.html')

def get_dataset(src, drug_id):
    src.drop('Unnamed: 0', axis = 1, inplace = True)
    df = src[src.ndc == drug_id].copy()
    df['date'] = pd.to_datetime(df['date'])
    df = df.set_index(['date'])
    df.sort_index(inplace=True)
    source = ColumnDataSource(data=df)
    return source


def make_plot(source, title):
    plot = figure(plot_width=800, plot_height = 800, tools="", x_axis_type = 'datetime', toolbar_location=None)

    plot.xaxis.axis_label = 'Time'
    plot.yaxis.axis_label = 'Price ($)'
    plot.axis.axis_label_text_font_style = 'bold'
    plot.x_range = DataRange1d(range_padding = 0.0)
    plot.grid.grid_line_alpha = 0.3 

    plot.title.text = title
    plot.line(x= 'date', y='nadac_per_unit', source=source)
    return plot


def update_plot(attrname, old, new):
    ver = vselect.value
    plot.title.text = "Drug Prices"
    src = get_dataset(df, ver)
    source.date.update(src.date)


df = pd.read_csv('data/plotting_data.csv')
ver = '54034808' #Initial id number
cc = df['ndc'].astype(str).unique() #select-menu options

vselect = Select(value=ver, title='Drug ID', options=sorted((cc)))

source = get_dataset(df, ver)
plot = make_plot(source, "Drug Prices")

vselect.on_change('value', update_plot)
controls = row(vselect)

curdoc().add_root(row(plot, controls))

Solution

  • There were some problems in your code:

    • You drop the Unnamed: 0 column in place inside get_dataset. That only works once: on any later call the column no longer exists, so the drop raises an error.
    • Your filter returned an empty dataframe. The Select widget passes its value as a string, while the ndc column appears to hold integers (your code has to call astype(str) to build the menu options), so src.ndc == drug_id never matches. Convert the value before comparing, for example df.loc[df['ndc'] == int(drug_id)].
    • To update the plot, replace source.data with the new data. A ColumnDataSource has no date attribute, so source.date.update(...) fails.
    import pandas as pd
    from bokeh.io import curdoc
    from bokeh.layouts import row
    from bokeh.models import Select, DataRange1d, ColumnDataSource
    from bokeh.plotting import figure
    
    # No output_notebook() or output_file() call is needed here:
    # `bokeh serve` renders whatever is added to curdoc().
    
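    def get_dataset(src, drug_id):
        # Drop the leftover index column if it is still there;
        # errors='ignore' makes a repeated call safe.
        src.drop('Unnamed: 0', axis=1, inplace=True, errors='ignore')
        # The Select value is a string; this assumes the ndc column holds integers.
        # .copy() avoids a SettingWithCopyWarning on the assignment below.
        df = src.loc[src['ndc'] == int(drug_id)].copy()
        df['date'] = pd.to_datetime(df['date'])
        df = df.set_index(['date'])
        df.sort_index(inplace=True)
        source = ColumnDataSource(data=df)
        return source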
    
    
    def make_plot(source, title):
        plot = figure(plot_width=800, plot_height = 800, tools="", x_axis_type = 'datetime', toolbar_location=None)
        plot.xaxis.axis_label = 'Time'
        plot.yaxis.axis_label = 'Price ($)'
        plot.axis.axis_label_text_font_style = 'bold'
        plot.x_range = DataRange1d(range_padding = 0.0)
        plot.grid.grid_line_alpha = 0.3 
        plot.title.text = title
        plot.line(x= 'date', y='nadac_per_unit', source=source)
        return plot
    
    
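    def update_plot(attrname, old, new):
        # 'new' is the value just chosen in the Select menu (a string).
        df1 = df.loc[df['ndc'] == int(new)].copy()
        df1['date'] = pd.to_datetime(df1['date'])
        df1 = df1.set_index(['date'])
        df1.sort_index(inplace=True)
        newSource = ColumnDataSource(df1)
        # Hand the new data to the source the plot already uses. Copying into a
        # plain dict is a precaution: some Bokeh versions may reject data that
        # is still owned by another source.
        source.data = dict(newSource.data)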
    
    
    df = pd.read_csv('data/plotting_data.csv')
    ver = '54034808' #Initial id number
    cc = df['ndc'].astype(str).unique() #select-menu options
    
    vselect = Select(value=ver, title='Drug ID', options=sorted((cc)))
    
    source = get_dataset(df, ver)
    plot = make_plot(source, "Drug Prices")
    
    vselect.on_change('value', update_plot)
    controls = row(vselect)
    
    curdoc().add_root(row(plot, controls))
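
To see why the original filter came back empty, here is a minimal sketch using only pandas. The small dataframe is a made-up stand-in for data/plotting_data.csv, and rows_for_drug is a hypothetical helper name, not part of Bokeh or of the code above. It assumes ndc is read as an integer column, and it compares strings with strings so it does not depend on that assumption.

```python
import pandas as pd

# Hypothetical stand-in for data/plotting_data.csv; the values are made up.
df = pd.DataFrame({
    "ndc": [54034808, 54034808, 11111111],
    "date": ["2018-02-01", "2018-01-01", "2018-01-01"],
    "nadac_per_unit": [0.25, 0.20, 1.50],
})


def rows_for_drug(frame, drug_id):
    """Return the rows for one drug, sorted by date, as a dict of lists."""
    # Compare strings with strings so the Select value matches whatever
    # dtype the 'ndc' column was read with.
    subset = frame.loc[frame["ndc"].astype(str) == str(drug_id)].copy()
    subset["date"] = pd.to_datetime(subset["date"])
    subset = subset.sort_values("date")
    return subset.to_dict(orient="list")


# Integer column compared with a string: no row matches.
mismatched = df.loc[df["ndc"] == "54034808"]
print(len(mismatched))

# The helper finds both rows for this drug, ordered by date.
data = rows_for_drug(df, "54034808")
print(data["nadac_per_unit"])
```

A dict of column names to lists like this is the shape that source.data expects, so a helper along these lines could serve both the initial load and the callback, which would keep the filtering logic in one place.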