python google-maps redis bokeh

How to cache bokeh plots using redis


I'm using bokeh server to render a timeseries graph over a map. As the timeseries progresses, the focus of the map moves.

The code below works, but each progression creates a call that goes off to the google api (GMAP) to get the backdrop. This then takes time to render. At points where the timeseries has shifted the focus a few times in quick succession, the backdrop hasn't had time to render before it is updated.

I've been trying to work out if/how these requests can be made in advance and cached (using redis), so that the user can view the cached version with all data already loaded for each tick of the timeseries.

main.py

from timeit import default_timer as timer  # assumed source of timer() used in slider_update

import settings
from bokeh.plotting import figure, gmap
from bokeh.embed import components
from bokeh.models import CustomJS, ColumnDataSource, Slider, GMapOptions, GMapPlot, Range1d, Button
from bokeh.models.widgets import DataTable, TableColumn, HTMLTemplateFormatter
from bokeh.layouts import column, row, gridplot, layout
from bokeh.io import show, export_png, curdoc

from filehandler import get_graph_data


"""
Get arguments from request
"""
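try:
    args = curdoc().session_context.request.arguments
    pk = int(args.get('pk')[0])
except (AttributeError, TypeError, IndexError, ValueError) as exc:
    # Without a valid pk there is no data file to load, so fail loudly
    raise ValueError("Missing or invalid 'pk' request argument") from exc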
"""
get data for graph from file and initialise variables
"""
#load data into dictionary from file referenced by pk
data_dict = get_graph_data(pk)

no_of_markers = data_dict.get('markers') 
series_length = data_dict.get('length')  # name used by slider_update and the Slider below
series_data = data_dict.get('data') #lat/lon position of each series at each point in time
series_names = series_data.get('series_names') #names of series
range_x_axis = data_dict.get('xaxis') #min/max lat co-ords
range_y_axis = data_dict.get('yaxis') #min/max lon co-ords


"""
Build data
"""
graph_source = ColumnDataSource(series_data)

"""
Build markers to show current location
"""
markers = ColumnDataSource(data=dict(lon=[], lat=[]))

"""
Build mapping layer
"""
def create_map_backdrop(centroid, zoom, tools):
    """
    Create the map backdrop, centered on the starting point
    Using GoogleMaps api
    """
    map_options = GMapOptions(lng=centroid[1],
                              lat=centroid[0],
                              map_type='roadmap',
                              zoom=zoom,
                              )

    return gmap(google_api_key=settings.MAP_KEY,
                map_options=map_options,
                tools=tools,
                )

#set map focus
centroid = (graph_source.data['lats'][0][0],
            graph_source.data['lons'][0][0],
            )


"""
Build Plot
"""

tools="pan, wheel_zoom, reset"
p = create_map_backdrop(centroid, 18, tools)
p.multi_line(xs='lons',
             ys='lats',
             source=graph_source,
             line_color='color',
             )
p.toolbar.logo = None
p.circle(x='lon', y='lat', source=markers)


"""
User Interactions
"""

def animate_update():
    tick = slider.value + 1
    slider.value = tick

def slider_update(attr, old, new):
    """
    Updates all of the datasources, depending on current value of slider
    """
    start = timer()
    if slider.value>series_length:
        animate()
    else:
        tick = slider.value
        i=0
        lons, lats = [], []
        marker_lons, marker_lats = [], []

        while i < no_of_markers:

            #update lines
            lons.append(series_data['lons'][i][0:tick])
            lats.append(series_data['lats'][i][0:tick])

            #update markers
            marker_lons.append(series_data['lons'][i][tick])
            marker_lats.append(series_data['lats'][i][tick])

            #update iterators
            i += 1

        #update marker display
        markers.data['lon'] = marker_lons
        markers.data['lat'] = marker_lats

        #update line display
        graph_source.data['lons'] = lons
        graph_source.data['lats'] = lats    

        #set map_focus (assumption: follow the first series, as the initial centroid does)
        map_focus_lon = series_data['lons'][0][tick]
        map_focus_lat = series_data['lats'][0][tick]

        #update map focus
        p.map_options.lng = map_focus_lon
        p.map_options.lat = map_focus_lat



slider = Slider(start=0, end=series_length, value=0, step=5)
slider.on_change('value', slider_update)
callback_id = None

def animate():
    global callback_id
    if button.label == "► Play":
        button.label = "❚❚ Pause"
        callback_id = curdoc().add_periodic_callback(animate_update, 1)

    else:
        button.label = "► Play"
        curdoc().remove_periodic_callback(callback_id)

button = Button(label="► Play", width=60)
button.on_click(animate)

"""
Display plot
"""


grid = layout([[p, data_table],
                [slider, button],
                ])

curdoc().add_root(grid)

I've tried caching the plot data (p), but it looks like this is persisted before the call to the google api is made.

I've explored caching the map tiles directly from the api and then stitching them into the plot as a background image (using bokeh ImageURL), but I can't get ImageURL to recognise the in-memory image.

The server documentation suggests that redis can be used as a backend, so I wondered whether this might speed things up, but when I try to start it with bokeh serve myapp --allow-websocket-origin=127.0.0.1:5006 --backend=redis I get: --backend is not a recognised command.

Is there a way to either cache the fully rendered graph (possibly the graph document itself), whilst retaining the ability for users to interact with the plot; or to cache the gmap plot once it has been rendered and then add it to the rest of the plot?


Solution

  • If this were standalone Bokeh content (i.e. not a Bokeh server app), you could serialize the JSON representation of the plot with json_items and re-hydrate it explicitly in the browser with Bokeh.embed_items. That JSON could potentially be stored in Redis, and maybe that would be relevant. But a Bokeh server is not like that. After the initial session creation, there is never any "whole document" to store or cache, just a sequence of incremental, partial updates that happen over a websocket protocol. E.g. the server says "this specific data source changed" and the browser says "OK, I should recompute bounds and re-render".

    That said, there are some changes I would suggest.

    The first is that you should not update CDS columns one by one. You should not do this:

    # BAD
    markers.data['lon'] = marker_lons
    markers.data['lat'] = marker_lats
    

    This will generate two separate update events and two separate re-render requests. Apart from the extra work this causes, the first update is also guaranteed to have mismatched old/new coordinates. Instead, you should always update the CDS .data dict "atomically", in one go:

    source.data = new_data_dict
    
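Applied to the question's slider_update, the atomic update might look like the sketch below. It reuses the question's names (series_data, no_of_markers, markers, graph_source); the helper build_tick_data and the sample data are hypothetical and only there to make the sketch runnable. The idea is that each source receives one complete dict per tick instead of column-by-column writes.

```python
def build_tick_data(series_data, no_of_markers, tick):
    """Return (line_data, marker_data): complete dicts for one slider tick."""
    lons, lats = [], []
    marker_lons, marker_lats = [], []
    for i in range(no_of_markers):
        # Line traces up to (not including) the current tick
        lons.append(series_data['lons'][i][:tick])
        lats.append(series_data['lats'][i][:tick])
        # Marker position at the current tick
        marker_lons.append(series_data['lons'][i][tick])
        marker_lats.append(series_data['lats'][i][tick])
    # Build a new dict so untouched columns (e.g. 'color') are carried over
    line_data = dict(series_data, lons=lons, lats=lats)
    marker_data = dict(lon=marker_lons, lat=marker_lats)
    return line_data, marker_data


# Made-up sample data: two series with three time steps each
sample = {
    'lons': [[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]],
    'lats': [[50.0, 50.1, 50.2], [51.0, 51.1, 51.2]],
    'color': ['red', 'blue'],
}
line_data, marker_data = build_tick_data(sample, no_of_markers=2, tick=1)

# Inside slider_update, each source would then be replaced in one assignment:
#     graph_source.data = line_data
#     markers.data = marker_data
```

Because the helper returns new dicts and leaves series_data untouched, the full series stays available for later ticks.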

    Additionally, you might try curdoc().hold to collect updates into fewer events.
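One way to use hold is sketched below: a small context manager that calls hold before a batch of updates and unhold afterwards. The names held and FakeDocument are hypothetical; FakeDocument only stands in for the object returned by curdoc() so the sketch runs without a server. Whether this reduces the map re-rendering in this particular app is untested.

```python
from contextlib import contextmanager


@contextmanager
def held(doc, policy="combine"):
    """Hold document change events while the body runs, then release them."""
    doc.hold(policy)
    try:
        yield doc
    finally:
        # Always release the hold, even if the update code raises
        doc.unhold()


class FakeDocument:
    """Stand-in for a Bokeh Document; records calls instead of sending events."""

    def __init__(self):
        self.calls = []

    def hold(self, policy="combine"):
        self.calls.append(("hold", policy))

    def unhold(self):
        self.calls.append(("unhold",))


doc = FakeDocument()
with held(doc):
    # Placeholder for the data source and map_options updates
    doc.calls.append(("update",))
```

In the app, the body of slider_update could be wrapped in `with held(curdoc()):` so that the data source and map focus changes for one tick are sent together.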