Trouble separating instances of bokeh in a flask app

I’m relatively new to using both bokeh and flask and am having some issues with data being shared between instances of my app.

I have removed a lot of my code so that I can include it below:

import datetime as dt
import threading
from os import getenv

import boto3
from awswrangler.timestream import query
from bokeh.application import Application
from bokeh.application.handlers.function import FunctionHandler
from bokeh.client.session import pull_session
from bokeh.embed import server_session
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.server.server import Server
from bokeh.util.token import generate_session_id
from dotenv import load_dotenv
from tornado.ioloop import IOLoop
from flask import Flask, render_template, request, jsonify

class DataViewer:

    def make_doc(self, doc):
        self.df = None

        self.source = ColumnDataSource(data={"x": [], "y": []}, name="source")

        self.p = figure(
            name="p",
            title="Level",
            sizing_mode="stretch_width",
            height=500,
            x_axis_type="datetime",
            y_range=(0, 1),
            output_backend="webgl",
        )

        self.renderer = self.p.line(x="x", y="y", source=self.source, color="navy")

        self.layout = self.p
        doc.add_root(self.layout)
        self.doc = doc
        return self.doc

    def prepare_view(self):
        self.source.data = dict(
            x=list(viewer.df["time"].apply(
                lambda x: x.replace(tzinfo=dt.timezone.utc).timestamp() * 1000)
            ),
            y=list(viewer.df["measure_value::double"]),
        )
        self.p.x_range.start = self.source.data['x'][0]
        self.p.x_range.end = self.source.data['x'][-1]
        self.p.y_range.start = max(self.source.data['y']) * 1.09
        self.p.y_range.end = min(self.source.data['y']) * 1.09

    def start_server(self, port=5000):
        bokeh_app = Application(FunctionHandler(self.make_doc))
        server = Server(
            {"/bkapp": bokeh_app},
            port=port,
            io_loop=IOLoop(),
            allow_websocket_origin=["*"],
        )
        server.start()
        server.io_loop.start()

load_dotenv()
boto3.setup_default_session(profile_name=getenv("PROFILE"))
app = Flask(__name__)

@app.route("/")
def dataviewer():
    with pull_session(
        session_id=generate_session_id(), url=f"http://localhost:5006/bkapp"
    ) as session:
        script = server_session(
            session_id=session.id, url=f"http://localhost:5006/bkapp"
        )
    return render_template("base.html", script=script)

@app.route("/search", methods=["POST"])
def search():
    print(request.form["imei"])
    try:
        viewer.df = query(
            f"""
            SELECT time, measure_value::double FROM DeviceData."RawData"
            WHERE measure_name = 'WaterDepth' AND Imei = '{request.form['imei']}'
            ORDER BY 1 ASC
            """
        )
    except Exception as e:
        print("get_data error", e)
        return (
            "error getting data",
            500,
        )
    if len(viewer.df) == 0:
        print("No data found")
    viewer.doc.add_next_tick_callback(viewer.prepare_view)
    return jsonify(viewer.source.data)

def bk_worker():
    global viewer
    viewer = DataViewer()
    viewer.start_server(port=5006)

threading.Thread(target=bk_worker).start()

app.run(port=5000)

And here is the html:

<!DOCTYPE html>
<html>

<head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Data Viewer</title>
    <script type="text/javascript" src="https://cdn.jsdelivr.net/jquery/latest/jquery.min.js"></script>
</head>

<body>
    <section class="hero is-primary is-fullheight">
        <div class="hero-body">
            <div class="container has-text-centered">
                <div id="loading-spinner" class="spinner-container">
                    <div class="spinner"></div>
                </div>
                <div class="flex-container">
                    <div class="left">
                        <form id="search">
                            <input name="imei" placeholder="IMEI" autofocus=""
                                pattern="^35(920610|710485|699584|612932|593998|308109|224763|019568)\d{7}"> </input>
                            <button type="submit">Search</button>
                        </form>
                    </div>
                </div>
                {{ script|safe }}
                <script>
                    $(document).ready(function () {
                        imei = 0;
                        $('#search').submit(function (e) {
                            e.preventDefault();
                            if (imei == $('#search input[name="imei"]').val()) {
                                return;
                            }
                            $("#loading-spinner").show();
                            imei = $('#search input[name="imei"]').val();
                            $.ajax({
                                url: 'search',
                                method: 'POST',
                                data: {
                                    imei: imei
                                }
                            }).done(function (response) {
                                let ds = Bokeh.documents[0].get_model_by_name('source');
                                ds.data['x'] = response['x']; ds.data['y'] = response['y'];
                                ds.change.emit();
                                $("#loading-spinner").hide();
                            })
                        });
                    });
                </script>
            </div>
        </div>
    </section>
</body>

</html>

I am using bokeh as a visualiser for an AWS Timestream database. I originally used the built-in widgets but found they did not have as much flexibility as I needed. This led me to use plain HTML with AJAX and direct my requests to flask, which then runs the function that would otherwise have been a callback for a bokeh button.

My issue is that if I have two instances of this app open in different tabs, they are somehow linked. When making a search from each instance in quick succession, only one instance will update correctly. It seems like the instance that does not work correctly does get some data, but the one that does update gets the set of data for each search. I have a video demonstrating this, but as I am a new user I cannot attach it.

I was also having a similar issue with callbacks that I had written which I have omitted from the code above as I believe the underlying issue has nothing to do with them.

I have tried looking at ways to separate instances, namely pull_session and using session ids but I just can’t seem to get this to work.

Any help would be greatly appreciated!

The problem is almost certainly in the way that you are storing some Bokeh models as instance attributes (e.g. self.source etc.) but then re-using the one instance over and over.

The callable that creates a Bokeh document needs to always create an entirely new set of models that are not “shared” between sessions. This is technically happening in your case, but note that since you only have the one DataViewer instance, all those instance attributes get overwritten whenever a new document for a new session is created. Bokeh does not care about this, per se, since it maintains its own data structures per session internally.

But look at your one prepare_view instance method callback, on the one instance of DataViewer, that every session will attempt to call. The callback updates self.source etc., but as noted, those values on that one instance of DataViewer are just whatever the most recently created session overwrote them with. Because of how you have arranged things, self.source etc. are not connected to the session that triggered the callback at all.
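To make the overwriting concrete, here is a minimal sketch in plain Python with no Bokeh involved. FakeSource and SharedViewer are hypothetical stand-ins for ColumnDataSource and DataViewer, and the "tab A" / "tab B" labels are only for illustration:

```python
class FakeSource:
    """Hypothetical stand-in for a ColumnDataSource; it only holds a dict."""

    def __init__(self, label):
        self.label = label
        self.data = {"x": [], "y": []}


class SharedViewer:
    """Mimics the pattern in the question: one instance reused by every session."""

    def make_doc(self, session_label):
        # A new model is created per call, but it is stored on the same
        # instance, replacing whatever the previous session stored there.
        self.source = FakeSource(session_label)
        return self.source

    def prepare_view(self, values):
        # Always writes to the most recently created source.
        self.source.data = {"x": list(range(len(values))), "y": list(values)}


viewer = SharedViewer()
source_a = viewer.make_doc("tab A")
source_b = viewer.make_doc("tab B")

# A search made from tab A ends up updating tab B's source.
viewer.prepare_view([1.0, 2.0])

print(source_a.label, source_a.data)
print(source_b.label, source_b.data)
```

The first source is never touched after the second make_doc call, which matches the symptom of one tab receiving the data for every search.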

If you want something like this to work, you will probably need to create a new DataViewer instance for every session, and then for bookkeeping, either attach it to the session (e.g. just assign it to some session.foo or whatever) or store it in a map of session ids to DataViewer instances or something. Then critically, your search method will need to somehow know the specific session for the page that generated the request, and then use this to locate the specific DataViewer for that session to update. You’ll also need to add some session destruction callbacks to perform cleanup on your DataViewer instances to avoid memory leaks.
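A rough sketch of the bookkeeping described above, assuming a simple dict keyed by session id. ViewerRegistry and Viewer are hypothetical names, not part of Bokeh. In a real app the id would come from the document (doc.session_context.id inside the function that builds the document), and the cleanup would be registered with doc.on_session_destroyed; here both are simulated with plain strings so the example runs on its own:

```python
import threading


class ViewerRegistry:
    """Hypothetical helper that maps session ids to per-session viewers."""

    def __init__(self, factory):
        self._factory = factory
        self._viewers = {}
        # Flask and the Bokeh server run in different threads in this setup,
        # so access to the dict is guarded by a lock.
        self._lock = threading.Lock()

    def create(self, session_id):
        viewer = self._factory()
        with self._lock:
            self._viewers[session_id] = viewer
        return viewer

    def get(self, session_id):
        with self._lock:
            return self._viewers.get(session_id)

    def discard(self, session_id):
        # Intended to be called from a session-destroyed callback.
        with self._lock:
            self._viewers.pop(session_id, None)

    def __len__(self):
        with self._lock:
            return len(self._viewers)


class Viewer:
    """Hypothetical stand-in for DataViewer."""

    def __init__(self):
        self.data = {"x": [], "y": []}


registry = ViewerRegistry(Viewer)

# One viewer per session, created when the document is built.
registry.create("session-a")
registry.create("session-b")

# The search route looks up the viewer for the session that sent the request.
registry.get("session-a").data = {"x": [0], "y": [3.5]}

# Cleanup when a session goes away, to avoid leaking viewers.
registry.discard("session-a")
```

The search route still has to receive the session id from the page somehow (a cookie or a form field, for example); the registry only covers the lookup and cleanup side.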

All of this assumes you are not behind a load balancer. If you are, things will be much, much more complicated, since there is no guarantee that the process that gets the search request is the same process that generated the session. You would need to make sure that some kind of session affinity is enforceable (some proxies can do this, but I don’t know any more about the details).

Hi Bryan, thanks for taking the time to respond. Having tried what you’re suggesting I’ve been trying to start the server at the class level then create instances this way so that I can map each instance to a session id. My issue is I don’t know what I should pass to Bokeh’s server class to make this work.

This is my revised code:

import datetime as dt
import re
import threading
from os import getenv

import boto3
from awswrangler.timestream import query
from bokeh.application import Application
from bokeh.application.handlers.function import FunctionHandler
from bokeh.embed import server_session
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.server.server import Server
from bokeh.util.token import generate_session_id
from dotenv import load_dotenv
from tornado.ioloop import IOLoop
from flask import Flask, render_template, request, jsonify, make_response


class BokehServer:
    def start_server(self, port=5006):
        bokeh_app = Application(FunctionHandler(make_doc))
        server = Server(
            {"bkapp": bokeh_app},
            port=port,
            io_loop=IOLoop(),
            allow_websocket_origin=["*"],
        )
        server.start()
        server.io_loop.start()


class DataViewer:
    def modify_doc(self, doc):
        self.df = None

        self.source = ColumnDataSource(data={"x": [], "y": []}, name="source")

        self.p = figure(
            name="p",
            title="Level",
            sizing_mode="stretch_width",
            height=500,
            x_axis_type="datetime",
            y_range=(0, 1),
            output_backend="webgl",
        )

        self.renderer = self.p.line(x="x", y="y", source=self.source, color="navy")

        self.layout = self.p
        doc.add_root(self.layout)
        self.doc = doc
        return self.doc

    def prepare_view(self):
        self.source.data = dict(
            x=list(
                self.df["time"].apply(
                    lambda x: x.replace(tzinfo=dt.timezone.utc).timestamp() * 1000
                )
            ),
            y=list(self.df["measure_value::double"]),
        )
        self.p.x_range.start = self.source.data["x"][0]
        self.p.x_range.end = self.source.data["x"][-1]
        self.p.y_range.start = max(self.source.data["y"]) * 1.09
        self.p.y_range.end = min(self.source.data["y"]) * 1.09


def make_doc(doc):
    viewer = DataViewer()
    return viewer.modify_doc(doc)


load_dotenv()
boto3.setup_default_session(profile_name=getenv('PROFILE'))
app = Flask(__name__)
app.config["SESSION_REFRESH_EACH_REQUEST"] = False

sessions = {}


@app.route("/set_session_id")
def set_session_id():
    resp = make_response()
    resp.set_cookie("session_id", generate_session_id())
    return resp


@app.route("/dataviewer")
def dataviewer():
    sessions[
        re.search(r"session_id=(.+)(\b|;)", request.headers["Cookie"]).group(1)
    ] = DataViewer()
    script = server_session(
        session_id=re.search(r"session_id=(.+)(\b|;)", request.headers["Cookie"]).group(
            1
        ),
        url="http://localhost:5006/bkapp",
    )
    return render_template("base.html", script=script)


@app.route("/search", methods=["POST"])
def search():
    viewer = sessions[
        re.search(r"session_id=(.+)(\b|;)", request.headers["Cookie"]).group(1)
    ]
    print(request.form["imei"])
    try:
        viewer.df = query(
            f"""
            SELECT time, measure_value::double FROM DeviceData."RawData"
            WHERE measure_name = 'WaterDepth' AND Imei = '{request.form['imei']}'
            ORDER BY 1 ASC
            """
        )
        print("done")
    except Exception as e:
        print("get_data error", e)
        return (
            "error getting data",
            500,
        )
    if len(viewer.df) == 0:
        print("No data found")
    viewer.doc.add_next_tick_callback(viewer.prepare_view)
    return jsonify(viewer.source.data)


def bk_worker():
    server = BokehServer()
    server.start_server()


threading.Thread(target=bk_worker).start()

app.run(port=5000)

When trying to open an instance I am receiving the following error in my browser:

dataviewer:1 Access to XMLHttpRequest at ‘http://localhost:5006/bkapp/autoload.js?bokeh-autoload-element=ccc47a4a-56ed-441f-b65c-95e632e4c1ce&bokeh-app-path=/bkapp&bokeh-absolute-url=http://localhost:5006/bkapp’ from origin ‘http://127.0.0.1:5000’ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: No ‘Access-Control-Allow-Origin’ header is present on the requested resource.

I hope I’ve made sense.

The default set of allowed websocket origins only includes localhost:5006 [1], so if you want to embed from other origins (even 127.0.0.1) then you’ll need to explicitly configure those allowed origins (or allow any origin with the value "*"). There are several topics on allowed websocket origins here that you can search as a first step.
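As an illustration of what "explicitly configure" could look like, the sketch below derives host:port origin strings from the page URLs that embed the app. The helper name websocket_origin and the two example URLs are assumptions based on the addresses in this thread; the resulting list is the kind of value that would be passed as allow_websocket_origin when constructing the Server:

```python
from urllib.parse import urlsplit


def websocket_origin(page_url):
    """Return the host[:port] part of a URL, the form used for allowed origins."""
    return urlsplit(page_url).netloc


# Pages that embed the Bokeh app (assumed from the Flask port used above).
embedding_pages = [
    "http://localhost:5000/dataviewer",
    "http://127.0.0.1:5000/dataviewer",
]

# De-duplicate and sort so the configuration is stable between runs.
allowed_origins = sorted({websocket_origin(url) for url in embedding_pages})
print(allowed_origins)
```

Note that localhost and 127.0.0.1 are treated as different origins, so both need to be listed if the page may be opened under either name.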


  1. The allowed websocket origin list is what controls which sites may embed a Bokeh app. Most users would not want anyone anywhere to be able to embed their app without their knowledge or consent, so the default value must be defined narrowly.

After making these changes I am still receiving the same error. On closer inspection it seems that the bokeh app is not displaying either. After setting the allowed websocket origin to localhost:5006 and opening this address once the app is running, I receive a 404 error. In the past this has successfully displayed the bokeh server instance without the components provided by flask.

This suggests to me that I must still be doing something wrong with the way that I am initialising the bokeh server within my flask app.

Never mind. It turns out the issue was caused by a missing ‘/’ when defining the route of the bokeh app in my BokehServer class: the key should have been "/bkapp" rather than "bkapp". Thanks for your help.
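For anyone hitting the same 404, a small guard like the following can catch the missing slash before the mapping is handed to the Server. This is only a sketch; normalize_route and normalize_applications are hypothetical helpers, and the string "bokeh_app" stands in for the real Application object:

```python
def normalize_route(path):
    """Ensure an application route starts with a single leading slash."""
    return "/" + path.lstrip("/")


def normalize_applications(applications):
    """Return a copy of the route-to-application mapping with fixed-up keys."""
    return {normalize_route(route): app for route, app in applications.items()}


# The mapping from the revised code, with the slash missing.
applications = {"bkapp": "bokeh_app"}

fixed = normalize_applications(applications)
print(fixed)
```

The normalized mapping then matches the URL used in server_session (http://localhost:5006/bkapp).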