Bokeh data is visualized only after script has ended


My script runs through some sequential data generation steps that will eventually produce a series of graphs. After each step, I want to visualize the graphs that are available so far, but I only get a visualization after the script has completely finished.

This is my code:

#!/usr/bin/python3                                                                             
from bokeh.plotting import curdoc, figure                                                      
import numpy as np                                                                             
                                                                                               
def plot_init(title):                                                                          
    global plot_circle                                                                         
    p = figure(title=title)                                                                    
    plot_circle = p.circle([],[])                                                              
    curdoc().add_root(p)                                                                       
                                                                                               
                                                                                               
def plot(data):                                                                                
    global plot_circle                                                                         
    plot_circle.data_source.stream(data)                                                       
    input("Continue...")                                                                       
                                                                                               
                                                                                               
plot_init("graph 1")
temp_x = np.random.rand(10)                                                                    
temp_y = np.random.rand(10)                                                                    
plot({'x': temp_x, 'y': temp_y})                                                               
                                                                                               
                                                                                               
plot_init("graph 2")                                                                           
temp_x = np.random.rand(10)                                                                    
temp_y = np.random.rand(10)                                                                    
plot({'x': temp_x, 'y': temp_y})         

I start the bokeh server like this: bokeh serve bokeh-test.py

This is what happens:

  1. I launch the server; the console waits.
  2. I open the URL in the browser; the script starts and "Continue..." is written to the console. Nothing is displayed yet; the browser is still loading.
  3. I press a key; the script continues and "Continue..." is written to the console again. Nothing is displayed yet; the browser is still loading.
  4. I press a key; the script finishes. Now the browser displays the two graphs that were generated.

I expected the first graph to be displayed after step 2 and the second graph to be added after step 3.

Most examples of 'streaming' data involve add_periodic_callback(), but I don't see how that would fit my application: I'm not handling data that streams in from some external data source, I'm just generating data in the flow of my script.

How can I visualize my data with incremental results?

Best regards, Vic


Solution

  • My problem was that changes made through curdoc() do not seem to be sent to the client right away:

    • if curdoc() is modified in the main flow of the script, the change only takes effect when the script reaches the end
    • if curdoc() is modified in a callback, the change only takes effect when the callback returns

    So I structured my script so that the data generation runs in a separate thread, and the drawing is done by a series of data-handling callbacks that are triggered by buttons. Each callback waits on a semaphore until the data has been generated; likewise, the data generation waits on a second semaphore until the user presses the button.

    Perhaps I made it overly complicated, but it works exactly as I want: I have a single flow of data generation, and after each step I can inspect the graph and push a button to continue with the next data generation step.

    Data is transferred from the data generator thread to the data handler callbacks via global variables.

    #!/usr/bin/python3
    from bokeh.plotting import curdoc, figure
    from bokeh.models import Button
    import threading
    import numpy as np
    import time
    
    handler_lock = threading.Semaphore(0)  # handler waiting for data
    generator_lock = threading.Semaphore(0)  # generator waiting for user pushing the button
    
    # THIS IS A BOKEH CALLBACK:
    def data_handler():
        global data
        global title
        global final
        # wait until data is ready
        print("Handler: waiting for data from generator")
        handler_lock.acquire()
        print("Handler: handling data from generator")
        # add new graph
        p = figure(title=title)
        plot_circle = p.circle([],[])
        curdoc().add_root(p)
        plot_circle.data_source.stream(data)
        # add the button for the user to trigger the callback that will draw the next graph
        if not final:
            button = Button(label="Continue data generation", button_type="success")
            button.on_click(data_handler)
            curdoc().add_root(button)
        # unlock the generator to continue generating data
        print("Handler: unlocking generator")
        generator_lock.release()
        # actual drawing is only done after the callback finishes!
    
    # THIS RUNS IN A SEPARATE THREAD:
    def data_generation():
        global data
        global title
        global final
        final = False
    
        # Data generation phase 1 (starts immediately!)
        print("Generator: starting data generation")
        temp_x = np.random.rand(10)
        temp_y = np.random.rand(10)
        #time.sleep(10)  # pretending data generation takes long
        print("Generator: Data generated, going to plot")
        data = {'x': temp_x, 'y': temp_y}
        title = "graph 1"
        # unlock the bokeh callback that is waiting for data
        print("Generator: Unlocking handler")
        handler_lock.release()
    
        # lock the generator until the callback is done
        print("Generator: waiting for handler to finish")
        generator_lock.acquire()
        print("Generator: continuing generating data")
        # Data generation phase 2
        print("Generator: starting data generation")
        temp_x = np.random.rand(10)
        temp_y = np.random.rand(10)
        #time.sleep(10)  # pretending data generation takes long
        print("Generator: Data generated, going to plot")
        data = {'x': temp_x, 'y': temp_y}
        title = "graph 2"
        final = True  # no button needed to continue
        # unlock the bokeh callback that is waiting for data
        print("Generator: unlocking handler")
        handler_lock.release()
    
    
    # start data generation in a separate thread
    t = threading.Thread(target=data_generation)
    t.start()
    
    # draw the start button
    button = Button(label="Start handling data", button_type="success")
    button.on_click(data_handler)
    curdoc().add_root(button)
    
    # main script finishes here, but the generator thread is still running and 
    # iterative bokeh drawing is done in callbacks
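
For anyone who wants to study the two-semaphore handshake on its own, here is a minimal sketch with Bokeh stripped out, so it runs as a plain Python script. It is an illustration under assumptions, not part of the script above: the `shared` dict stands in for the global variables, the `shown` list stands in for the graphs added to the document, calling `data_handler()` in a loop stands in for the user pressing the button, and the `steps` parameter is made up to show that the same handshake works for more than two phases.

```python
import threading

handler_lock = threading.Semaphore(0)    # released when a data set is ready
generator_lock = threading.Semaphore(0)  # released when the handler is done

shared = {}  # assumption: stands in for the globals data/title/final
shown = []   # assumption: stands in for the graphs added to the document


def data_generation(steps):
    """Runs in a separate thread and produces one data set per step."""
    for i in range(1, steps + 1):
        shared["title"] = f"graph {i}"
        shared["data"] = {"x": [i], "y": [i * i]}
        shared["final"] = (i == steps)
        handler_lock.release()        # unlock the handler: data is ready
        if not shared["final"]:
            generator_lock.acquire()  # wait until the handler has finished


def data_handler():
    """Stands in for the button callback; returns True after the last step."""
    handler_lock.acquire()            # wait until data is ready
    # Copy what we need before the generator is allowed to overwrite it.
    shown.append((shared["title"], dict(shared["data"])))
    final = shared["final"]
    generator_lock.release()          # unlock the generator
    return final


t = threading.Thread(target=data_generation, args=(3,))
t.start()

# Each loop iteration plays the role of one button press.
while not data_handler():
    pass

t.join()
print([title for title, _ in shown])
```

The handler copies the title and data before it releases `generator_lock`, because from that moment on the generator may overwrite the shared values with the next step.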