Why is bokeh plot on web server loading so slow?


I really like bokeh and I'd like to have a bokeh plot on my website. Running bokeh locally (on Linux) with bokeh serve works like a charm. But I am having a really hard time getting it to run smoothly on the web server.

I am running it on a DigitalOcean droplet (Ubuntu 18.04 with a LAMP stack and nginx as a reverse proxy). Since I read that bokeh serve is not meant for production, I tried to set up a more secure and robust environment. I used this article to get started. With another DigitalOcean tutorial and lots of online research and trial and error, I finally got it to work.

Well, at least kind of, because it is incredibly slow, to the point that it took me a while to notice that it worked at all, and it is pretty much unusable. Loading usually takes 4 s, sometimes up to 10 s (granted, my internet connection is not very fast).

The HTML from the template loads pretty fast, though. But these files take a very long time to load:

    bokeh.min.js?v=540... (748KB)
    bokeh-widgets.min.js?v=409... (97KB)
    bokeh-tables.min.js?v=623... (256KB)
    bokeh-gl.min.js?v=823... (62KB)
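
As a rough sanity check on those numbers, the four file sizes can be summed and divided by a connection speed. This is only a sketch: the 2 Mbit/s bandwidth is a hypothetical value chosen for illustration (not a measurement), 1 KB is treated as 1000 bytes, and latency, caching, and compression are ignored.

```python
# File sizes in KB, as listed above from the browser's network tab.
BOKEH_JS_SIZES_KB = {
    "bokeh.min.js": 748,
    "bokeh-widgets.min.js": 97,
    "bokeh-tables.min.js": 256,
    "bokeh-gl.min.js": 62,
}


def estimated_transfer_seconds(sizes_kb, bandwidth_mbit_per_s):
    """Estimate raw transfer time, ignoring latency, caching, and compression."""
    total_kilobits = sum(sizes_kb.values()) * 8  # 1 KB taken as 8 kilobits
    return total_kilobits / (bandwidth_mbit_per_s * 1000)


total_kb = sum(BOKEH_JS_SIZES_KB.values())
# ASSUMPTION: 2 Mbit/s is a hypothetical connection speed, not a measured one.
seconds = estimated_transfer_seconds(BOKEH_JS_SIZES_KB, bandwidth_mbit_per_s=2)
print(f"{total_kb} KB total, about {seconds:.1f} s at 2 Mbit/s")
```

Under those assumptions the estimate lands in the same range as the load times described above, which would be consistent with download size, rather than the plot itself, accounting for much of the delay.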

I checked, and this is also the case for the website with the article I used as a starting point. The official bokeh examples (e.g. weather) load pretty quickly, even though their graphs are far more complex, but there the aforementioned files are also much smaller.

This is part of the nginx conf file:

    location / {
        include proxy_params;
        proxy_pass http://unix:/home/user/myproject/myproject.sock;
    }

    # reverse proxy to embedded bokeh apps
    location /bokeh/ {
        proxy_pass http://127.0.0.1:5100;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host:$server_port;
        proxy_buffering off;
    }

This is the plot file run_bokeh.py:

from bokeh.plotting import figure
from bokeh.layouts import column
from bokeh.server.server import Server

def run(doc):

    x = [1, 2, 3, 4, 5]
    y = [6, 7, 2, 4, 5]

    fig = figure(title="simple line example", x_axis_label='x', y_axis_label='y')
    fig.line(x, y, legend="Temp.", line_width=2)

    layout = column(fig)
    doc.add_root(layout)


kws = {'port': 5100, 'prefix': '/bokeh', 'allow_websocket_origin': ['my-domain.com']}
server = Server(run, **kws)
server.start()
if __name__ == '__main__':
    server.io_loop.add_callback(server.show, '/')
    server.io_loop.start()

and the flask file myproject.py:

from flask import Flask, render_template
from bokeh.embed import server_document

app = Flask(__name__)

@app.route("/")
def index():
    tag = server_document(url=r'/bokeh', relative_urls=True)
    return render_template('index.html', tag=tag)

I couldn't figure out how to make the bokeh.....min.js files smaller, or whether there is something fundamentally wrong with my setup.

PS:

I could find plenty of tutorials (even video tutorials) on how to use bokeh in dev mode, with bokeh serve. But very little on how to use it in production. So if someone can point me in a direction, on how to properly learn using bokeh on a production server, I would appreciate that greatly.

Update

  • I think I didn't articulate properly what I meant. By making the bokeh.....min.js files smaller, I meant that the file loaded in my setup is almost four times as large as the one in the official example or on the CDN:
    https://my-domain.com/bokeh/static/js/bokeh.min.js?v=547e7d2591695b654def5914bdd697fa → 748 KB
    https://cdn.bokeh.org/bokeh/release/bokeh-1.3.4.min.js → 202 KB
    I was wondering how to get to the 202 KB.

  • I tried to use the CDN, but it doesn't work.
    I added from bokeh.resources import CDN to myproject.py, but then I couldn't figure out where to insert CDN (server_document(...) expects a string for resources).

from flask import Flask, render_template
from bokeh.embed import server_document
from bokeh.resources import CDN

app = Flask(__name__)

@app.route("/")
def index():
    # This attempt fails: a positional argument (CDN) cannot follow keyword
    # arguments, and the resources parameter expects a string anyway.
    tag = server_document(url=r'/bokeh', relative_urls=True, CDN)
    return render_template('index.html', tag=tag)

When I manually add the scripts to the template file index.html, they get blocked (blocked-mixed-content), and the other scripts are loaded anyway.

<link href="https://cdn.bokeh.org/bokeh/release/bokeh-1.3.4.min.css" rel="stylesheet" type="text/css">
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-1.3.4.min.js"></script>

Solution

  • First thing I should mention is that the next version of Bokeh (1.4) will support:

    • direct SSL termination (i.e. https connections)
    • providing "auth hooks" for login/logout

    That said, there will still be good reasons to run behind a proxy like Nginx sometimes, e.g. for load balancing.

    On to your specific problem: One reason the demo site might load those resources faster is that it is configured to load them from the Bokeh CDN. In the Dockerfile for the demo site, it does:

    ENV BOKEH_RESOURCES=cdn
    

    So you could also try setting the BOKEH_RESOURCES environment variable before running your Bokeh server. Loading from CDN (which is AWS Cloudfront):

    • is probably faster than loading from DO
    • will take some of the load off the Bokeh server itself
    • allows the JS files to be cached by users' browsers
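
One way to apply that suggestion from Python is to start the server script as a child process with the variable set in its environment. This is a minimal sketch, not the only way to do it: the helper names `cdn_environment` and `launch_bokeh_server` are made up for illustration, and it assumes that `run_bokeh.py` (the file name from the question) is in the current working directory.

```python
import os
import subprocess
import sys


def cdn_environment(base_env=None):
    """Return a copy of the environment with BOKEH_RESOURCES set to "cdn"."""
    env = dict(os.environ if base_env is None else base_env)
    env["BOKEH_RESOURCES"] = "cdn"
    return env


def launch_bokeh_server(script="run_bokeh.py"):
    """Start the server script as a child process with the CDN setting applied.

    ASSUMPTION: `script` is the run_bokeh.py file from the question and is
    located in the current working directory.
    """
    return subprocess.Popen([sys.executable, script], env=cdn_environment())
```

Calling `launch_bokeh_server()` would then start the script with the setting in place. Exporting the variable in the shell or service definition that starts the server should have the same effect.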

    So that's my first suggestion. If things are still too slow after that, then we could look at removing unneeded JS files (e.g. if you are not using WebGL or DataTables, then at least two of those files are not needed at all). But that might take some more back-and-forth discussion and experimentation, since I am not sure that's doable without using the server "programmatically". So I'd say the Bokeh Project Discourse is a better venue for that discussion than SO.

    As for:

    I couldn't figure out how to make the bokeh.....min.js files smaller

    There is no way to make them smaller; they are the size they are. In a recent version the size did grow somewhat due to inlining CSS, but that is offset by no longer having to load a separate CSS file. The total load size should be roughly equivalent (we actually have CI tests that will fail if the size grows unexpectedly).
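
To illustrate the general idea of a size check like the one mentioned above, here is a hypothetical sketch. It is not Bokeh's actual CI code: the function name `check_size_budget`, the file name, and the byte budget are all assumptions made up for this example.

```python
import os
import tempfile


def check_size_budget(path, max_bytes):
    """Return the file size, or raise AssertionError if it exceeds max_bytes."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise AssertionError(
            f"{path} is {size} bytes, over the budget of {max_bytes} bytes"
        )
    return size


# Demo with a throwaway file standing in for a built JS bundle.
with tempfile.TemporaryDirectory() as tmp_dir:
    bundle_path = os.path.join(tmp_dir, "example.min.js")  # hypothetical name
    with open(bundle_path, "wb") as bundle:
        bundle.write(b"x" * 1000)
    demo_size = check_size_budget(bundle_path, max_bytes=2000)
```

A check of this kind would fail the build as soon as a bundle grows past the agreed budget, instead of letting the increase go unnoticed.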