Graphs not rendering for Bokeh Server Embed in Flask running in Docker Container

lemgog · July 10, 2021, 8:55am

I have a Flask application with a Bokeh server embedded as documented in the example at bokeh/flask_gunicorn_embed.py at ed54705d627ae500509e57fe071fc7a60fc8a951 · bokeh/bokeh · GitHub

However, it seems as if the ports for the Bokeh server are generated dynamically in this code (i.e. cannot be predicted ahead of time and exposed in my Dockerfile), so when I create a Docker container for my Flask application, I don’t have the correct ports exposed. When I visit the route in the Flask app, it renders the webpage but not the Bokeh server document graph on the webpage. No other errors are being thrown, so I assume this has something to do with the port that the Bokeh server is running on not being exposed. The code works fine when running outside of a Docker container. I just cannot render the graphs on my host machine when it is running in a Docker container. Is there another way to approach handling ports with Bokeh embedded in a multi-threaded production application on gunicorn? Or to hard-code in the ports being used for multiple workers?

_jm · July 10, 2021, 4:27pm

Hi @lemgog

The relevant lines of the example you referenced are …

# This is so that if this app is run using something like "gunicorn -w 4" then
# each process will listen on its own port
sockets, port = bind_sockets("localhost", 0)

A zero-valued second argument to the bind_sockets function means that the operating system will choose a free port, and the port return argument indicates what was chosen. See description in the bokeh reference guide here.

You could specify a non-zero second argument in the bind_sockets call to hard-code a port number that is compatible with your docker file configuration.
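The port-0 behaviour described above comes from the operating system rather than from Bokeh. A minimal sketch with the standard library's `socket` module shows it; the helper name `bind_on` is made up for this illustration, and Bokeh's `bind_sockets` is not used here.

```python
import socket


def bind_on(host: str, port: int) -> socket.socket:
    """Bind a TCP socket to (host, port) and return it, still open.

    Hypothetical helper for illustration only; this is not Bokeh's bind_sockets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    return sock


# Port 0 asks the OS for any free port; getsockname() reports the choice.
auto = bind_on("127.0.0.1", 0)
chosen_port = auto.getsockname()[1]
print("OS chose port", chosen_port)
auto.close()
```

Because the number is only known after binding, it cannot be listed in a Dockerfile ahead of time, which is why a fixed value or a fixed range is needed there.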

lemgog · July 10, 2021, 6:46pm

Yes, I understand that; however, if I hard-code a port, will that still function with multiple workers? Won’t every worker then be trying to use the same port? I guess I could add some logic that checks within a for loop whether the port is in use (sockets - Fast way to test if a port is in use using Python - Stack Overflow), increments the port number by one each time it is in use, within a particular range of port numbers, and then expose that range in my docker-compose file. It doesn’t seem like the most elegant way to do it though.

for port in range(5006, 5011):
    if port.in_use:
        continue
    else:
        sockets, port = bind_sockets("localhost", port)
        break

_jm · July 10, 2021, 7:43pm

No. It will fail ungracefully – crashing gunicorn – if you hardcode a single number and try to use multiple workers.

As you surmise, you will need to add some additional logic to provide deterministic, unique port numbers compatible with your docker file constraints. This could be a simple loop over ports as you’ve shown, or some other way to maintain state and allocate ports to the workers.

If you go with the loop over ports as in your example, I believe you need to change the logic slightly as the port variable is being used both for the loop iterator and the return argument of bind_sockets.
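One way to avoid the name clash is to keep the loop variable and the bound result separate, and to attempt the bind directly rather than probing first, since a probe followed by a bind leaves a window in which another worker can take the port. The sketch below uses plain sockets so that it runs without Bokeh; `bind_first_free` is a hypothetical helper. In the embed example the `sock.bind` call would be replaced by `bind_sockets("localhost", candidate)`, assuming that call raises `OSError` when the port is taken.

```python
import socket
from typing import Iterable, Tuple


def bind_first_free(host: str, candidates: Iterable[int]) -> Tuple[socket.socket, int]:
    """Return (socket, port) for the first candidate port that can be bound."""
    for candidate in candidates:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, candidate))
        except OSError:
            # Port already taken (for example by another worker); try the next one.
            sock.close()
            continue
        sock.listen()
        return sock, candidate
    raise RuntimeError("no free port in the candidate range")
```

A call such as `bind_first_free("localhost", range(5006, 5011))` would then match the range exposed in the docker-compose file.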

lemgog · July 11, 2021, 4:09am

Ok, so I was able to implement that and the Bokeh plots render fine within the webpage when I run the flask app just by itself. However, when I run it from within the Docker container, no errors are thrown and the graphs don’t render. I assign the ports within a range as documented above and I have exposed those same ports in my docker-compose.yml. Is there anything else that I need to do or something that I am missing to explain why the plots are not rendering when running in the Docker container but render correctly outside of it?

Perhaps it is my allow-websocket-origins argument. I have it all set to localhost, but the Docker container’s ip address is different, so the origin would be different. Since the IP address is not static, is there a way to account for this without requiring a static ip address assigned to the docker container?

This is what I have tried for the websocket origins. settings.port is the variable in which I store the port that the Bokeh server is running on for a particular thread, so that the rest of the Flask app can access it. As you can see, I tried adding "*" to the websocket origins as a catch-all, but the plots still do not render in the Flask app. 172.21.0.1 is the IP address of the Docker container that was running.

bokeh_tornado = BokehTornado({'/plot1': plot1, \
        '/plot2':plot2, '/plot3':plot3}, \
        extra_websocket_origins=["127.0.0.1:5000", "127.0.0.1:"+str(settings.port), \
        "0.0.0.0:"+str(settings.port), "localhost:"+str(settings.port), "0.0.0.0:5000", \
            "localhost:5000", "172.21.0.1:5000", "172.21.0.1:"+str(settings.port), "*"])
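The origin list above can be generated rather than typed out. A small sketch, assuming the allowed host names are known up front; `build_origins` is a hypothetical helper, not part of Bokeh, and the hosts and ports are example values only.

```python
from itertools import product
from typing import Iterable, List


def build_origins(hosts: Iterable[str], ports: Iterable[int]) -> List[str]:
    """Return every host/port combination as a 'host:port' string."""
    return [f"{host}:{port}" for host, port in product(hosts, ports)]


# Example values only: 5000 is the Flask port, 5006 stands in for settings.port.
origins = build_origins(["localhost", "127.0.0.1"], [5000, 5006])
print(origins)
```

The result would be passed as `extra_websocket_origins`. As far as I understand, the value that is checked is the origin of the page the browser loaded, i.e. the host and port in the address bar for the Flask page.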

lemgog · July 11, 2021, 9:45pm

I suspect that my issue is similar to those documented here

github.com/bokeh/bokeh

Embedded Bokeh server doc should expose hook for error reporting

opened 04:16PM - 06 Jul 20 UTC

carve11

type: feature

Based on comment from @bryevdv on [Bokeh Discourse discussion](https://discourse….bokeh.org/t/how-to-get-embedded-bokeh-server-doc-to-report-page-error-to-flask-server/5918/2?u=jonas_grave_kristens) forum I am creating this feature request.

The setup is a Bokeh server and Flask server where the flask server is serving embedded `server_document`s. If for some reason the Bokeh server fails to GET a given document and reports a page error, eg 403, then I would like to capture that and by a Flask template inform the the user of an issue instead of the user just looking on the blank page.

I have tried to add routes like `@app.errorhandler(403)` but also `@app.teardown_request` in order to try to capture any page error but so far no luck. If I understand correctly as long as flask is able to create the embedded template (the html doc) then that would not return any page error.

To show what I mean I have taken the Flask embed example from [git](https://github.com/bokeh/bokeh/tree/branch-2.2/examples/embed/arguments) and used that. In the `flask_server.py` script I have added some error handling routes mentioned above but they are not run if Bokeh server reports a eg 403 GET error.

I first start the bokeh server app:

```
bokeh serve bokeh_server.py --allow-websocket-origin=127.0.0.1:5000
```

then I start the flask app

```
python flask_server.py
```

When running this and going to `localhost:5000`, when one chooses a link I get to the `app_html` page but no embedded plot is shown due to 403 page error reported by Bokeh server. I would like to capture this error and show to the user that there is a page error - how do I do that?

Firefox console:

```
Request URL:http://localhost:5006/bokeh_server/ws?bokeh-protocol-version=1.0&bokeh-session-id=MgpZNLFpFlZDDacVqZzjLlWZHmcjCngpWn19QiIZnyXs
Request method:GET
Remote address:127.0.0.1:5006
Status code: 403
Version:HTTP/1.1
```

Have tested in Bokeh version 1.3.4 (py2.7) and 2.1.1 (py3.6.8), both on Redhat 7.

Unfortunately my limited java script skills are limiting me from putting forward a proposal for a solution.

---

### The scripts:

**bokeh_server.py**

```
#!/usr/bin/env python
'''This example demonstrates embedding an autoloaded Bokeh server
into a simple Flask application, and passing arguments to Bokeh.

To view the example, run:

    python flask_server.py

in this directory, and navigate to:

    http://localhost:5000

'''
import numpy as np

from bokeh.io import curdoc
from bokeh.layouts import column, row
from bokeh.models import ColumnDataSource, Slider, TextInput
from bokeh.plotting import figure

# Retrieving the arguments
args = curdoc().session_context.request.arguments

try:
    batchid = int(args.get('batchid')[0])
except (ValueError, TypeError):
    batchid = 1

func = {
    1 : np.cos,
    2 : np.sin,
    3 : np.tan
}[batchid]

# Set up data
N = 200
x = np.linspace(0, 4*np.pi, N)
y = func(x)
source = ColumnDataSource(data=dict(x=x, y=y))

# Set up plot
plot = figure(plot_height=400, plot_width=400, title="my wave",
              tools="crosshair,pan,reset,save,wheel_zoom",
              x_range=[0, 4*np.pi], y_range=[-2.5, 2.5])

plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)

# Set up widgets
text = TextInput(title="title", value="Batch n°{}".format(batchid))
offset = Slider(title="offset", value=0.0, start=-5.0, end=5.0, step=0.1)
amplitude = Slider(title="amplitude", value=1.0, start=-5.0, end=5.0)
phase = Slider(title="phase", value=0.0, start=0.0, end=2*np.pi)
freq = Slider(title="frequency", value=1.0, start=0.1, end=5.1)

# Set up callbacks
def update_title(attrname, old, new):
    plot.title.text = text.value

text.on_change('value', update_title)

def update_data(attrname, old, new):

    # Get the current slider values
    a = amplitude.value
    b = offset.value
    w = phase.value
    k = freq.value

    # Generate the new curve
    x = np.linspace(0, 4*np.pi, N)
    y = a*func(k*x + w) + b

    source.data = dict(x=x, y=y)

for w in [offset, amplitude, phase, freq]:
    w.on_change('value', update_data)

# Set up layouts and add to document
inputs = column(text, offset, amplitude, phase, freq)

curdoc().add_root(row(inputs, plot, width=800))
```

**flask_server.py**

```
'''This example demonstrates embedding an autoloaded Bokeh server
into a simple Flask application, and passing arguments to Bokeh.

To view the example, run:

    python flask_server.py

in this directory, and navigate to:

    http://localhost:5000

'''
import atexit
import subprocess

from flask import Flask, render_template_string

from bokeh.embed import server_document

home_html = """
<!DOCTYPE html>
<html lang="en">
  <body>
    <div class="bk-root">
      <h1><a href="/batch/1"> Batch 1 (cos)</a></h1>
      <h1><a href="/batch/2"> Batch 2 (sin)</a></h1>
      <h1><a href="/batch/3"> Batch 3 (tan)</a></h1>
    </div>
  </body>
</html>
"""

app_html = """
<!DOCTYPE html>
<html lang="en">
  <body>
    <div>
      <h2><a href="/batch/1">Batch 1 (cos)</a> - <a href="/batch/2">Batch 2 (sin)</a> - <a href="/batch/3">Batch 3 (tan)</a></h2>
    </div>
    {{ bokeh_script|safe }}
  </body>
</html>
"""

err_html = """
<!DOCTYPE html>
<html lang="en">
  <body>
    <div>
      <h2>{{ error }}</h2>
    </div>
  </body>
</html>
"""

app = Flask(__name__)

#bokeh_process = subprocess.Popen(
#    ['python', '-m', 'bokeh', 'serve', '--allow-websocket-origin=localhost:5000', 'bokeh_server.py'], stdout=subprocess.PIPE)

@atexit.register
def kill_server():
    bokeh_process.kill()

@app.route('/')
def home():
    return render_template_string(home_html)

@app.route('/batch/<int:batchid>')
def visualization(batchid):
    bokeh_script = server_document(url='http://localhost:5006/bokeh_server', arguments=dict(batchid=batchid))
    return render_template_string(app_html, bokeh_script=bokeh_script)

@app.errorhandler(404)
def page_not_found(e):
    return render_template_string(err_html, error = "404 Not found")

@app.errorhandler(403)
def page_not_found(e):
    return render_template_string(err_html, error = "403 Forbidden")

@app.teardown_request
def teardown_request_func(error=None):
    print("teardown_request is running")
    if error:
        # log error
        print(str(error))

if __name__ == '__main__':
    app.run(debug=True)
```

The server document is not rendering the graphs for some reason that I cannot determine, and no errors are output. Any recommendations on the best way to troubleshoot? Thank you.

Bryan · July 11, 2021, 10:14pm

What does this mean, exactly? Typically there is potentially valuable information in both the browser JavaScript console log, as well as the console log of the Bokeh server process. Have you examined both of these?

In any case, even an absence of information is information. The Bokeh server logs every connection attempt, whether it was successful or rejected. If there is really nothing interesting in your Bokeh server logs, then that means no connections ever even reached it to begin with—your problem is upstream before the Bokeh server (e.g. a misconfigured proxy or a firewall blocking something).
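A quick way to test the "nothing ever reached the server" case is to try a plain TCP connection to the Bokeh port from the machine where the browser runs. A minimal sketch using only the standard library; `can_connect` is a hypothetical helper, and the host and port are whatever the embedded script URL points at.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

If `can_connect("localhost", 5006)` returns False on the host while the server is running in the container, the problem is likely in how the port is bound or published rather than in the Bokeh application itself.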

lemgog · July 11, 2021, 10:46pm

So what I mean by this is nothing is sent to stdout on the terminal I am running the Flask app from. So I enter “Flask run”, and there is no output from the Bokeh server there indicating a failure or error in the document script. In the browser, this is what I see:

This makes me think it has something to do with the websocket origins I have specified. This is the method in which I am attempting to access the bokeh server document in my Flask route

spectrogram_plot = server_document('http://localhost:%d/spectrogram_plot' % int(settings.port))

Where settings.port is how I am keeping track of the port for that particular worker thread. Should I be using something different than localhost?

I also have some print statements inside my code that creates the plot, and those print statements never execute while running inside the Docker container.

Or is there another way to view the logs from the Bokeh server that I am not seeing or implementing?

lemgog · July 11, 2021, 11:12pm

Another thing that might be relevant is when I access the route that calls the bokeh server, I get the below output from werkzeug:

However, when I run ifconfig in the Docker container, the IP address is different

I am not sure if this is an issue or not, and don’t really understand why the IP addresses would be different

Bryan · July 11, 2021, 11:16pm

If a Bokeh server rejects a connection because it doesn’t like the origin configuration, that is 100% always logged in the Bokeh server output (including instructions with exactly what change is needed to allow the connection).

The 403s look spurious to me, because those CSS files no longer exist with recent versions of Bokeh. Are you specifying CDN resource loads manually in a template?

Also just to make sure we are on the same page, this is the log output I am talking about:

base ❯ bokeh serve --show sliders.py
2021-07-11 16:13:18,913 Starting Bokeh server version 2.4.0dev1-43-g4aac79f80 (running on Tornado 6.1)
2021-07-11 16:13:18,915 User authentication hooks NOT provided (default user enabled)
2021-07-11 16:13:18,918 Bokeh app running at: http://localhost:5006/sliders
2021-07-11 16:13:18,918 Starting Bokeh server with process id: 1685
2021-07-11 16:13:19,806 404 GET /apple-touch-icon-precomposed.png (::1) 0.43ms
2021-07-11 16:13:19,809 404 GET /apple-touch-icon.png (::1) 0.33ms
2021-07-11 16:13:19,822 WebSocket connection opened
2021-07-11 16:13:19,822 ServerConnection created
2021-07-11 16:13:19,836 404 GET /favicon.ico (::1) 0.92ms

You can see the connection logged there. If the server rejects a connection, that is also logged:

2021-07-11 16:14:36,951 Refusing websocket connection from Origin 'http://localhost:5006'; use --allow-websocket-origin=localhost:5006 or set BOKEH_ALLOW_WS_ORIGIN=localhost:5006 to permit this; currently we allow origins {'foo.com:80'}

If neither of those is present in the log, then as I said above, no connection ever reached the server at all. The problem is somewhere earlier in front of Bokeh.

lemgog · July 11, 2021, 11:33pm

So I am not starting the Bokeh server in that manner, I am using the method in this example

github.com

bokeh/bokeh/blob/branch-2.4/examples/howto/server_embed/flask_gunicorn_embed.py

try:
    import asyncio
except ImportError:
    raise RuntimeError("This example requries Python3 / asyncio")

from threading import Thread

from flask import Flask, render_template
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop

from bokeh.application import Application
from bokeh.application.handlers import FunctionHandler
from bokeh.embed import server_document
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Slider
from bokeh.plotting import figure
from bokeh.sampledata.sea_surface_temperature import sea_surface_temperature
from bokeh.server.server import BaseServer
from bokeh.server.tornado import BokehTornado

This file has been truncated.

My code to do so is here:

spectrogram_plot = Application(FunctionHandler(spectrogram_plot))

# This is so that if this app is run using something like "gunicorn -w 4" then
# each process will listen on its own port

def is_port_in_use(port):
    import socket
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(('localhost', port)) == 0
    
from app import settings
settings.init()
for p in range(5006, 5011):
    if is_port_in_use(p):
        continue
    else:
        sockets, settings.port = bind_sockets("localhost", p)
        break

def bk_worker():
    asyncio.set_event_loop(asyncio.new_event_loop())

    bokeh_tornado = BokehTornado({'/spectrogram_plot':spectrogram_plot}, \
        extra_websocket_origins=["127.0.0.1:5000", "127.0.0.1:"+str(settings.port), \
        "0.0.0.0:"+str(settings.port), "localhost:"+str(settings.port), "0.0.0.0:5000", \
            "localhost:5000", "172.21.0.1:5000", "172.21.0.1:"+str(settings.port), "172.21.0.2:5000", "172.21.0.2:"+str(settings.port), "*"])
    bokeh_http = HTTPServer(bokeh_tornado)
    bokeh_http.add_sockets(sockets)

    server = BaseServer(IOLoop.current(), bokeh_tornado, bokeh_http)
    server.start()
    server.io_loop.start()

t = Thread(target=bk_worker)
t.daemon = True
t.start()

Then within my Flask routes.py, I have

spectrogram_plot = server_document('http://localhost:%d/spectrogram_plot' % int(settings.port))

Since this is the way I access the Bokeh server, no logs are output to stdout. Is there a different way that I should configure this that is better? This is what I see when I run my spectrogram_plot.py via the cmd line

However, I don’t get any graphs (just a blank page) when I visit the provided URL in my browser. In the same environment, using just the code embedded in my Flask app, the plots render fine, although I suspect this has something to do with the fact that the script uses a database module that is present in my Flask app but not when running the Bokeh script by itself. When I visit the provided URL in a Docker environment using the bokeh serve command, it creates the websocket connection successfully and no graphs render, same as in my non-Docker environment. However, in my non-Docker environment, when accessed through the Flask app, the graphs render successfully, and they don’t in the Docker environment.

These are the scripts I have embedded in my html template

      <link rel="stylesheet" href="http://cdn.pydata.org/bokeh/release/bokeh-2.3.0.min.css" type="text/css" />
      <link href="http://cdn.bokeh.org/bokeh/release/bokeh-tables-2.3.0.min.css" rel="stylesheet" type="text/css">
      <script type="text/javascript" src="http://cdn.pydata.org/bokeh/release/bokeh-2.3.0.min.js"></script>
      <script src="https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.3.0.min.js"
        crossorigin="anonymous"></script>
      <script src="https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.3.0.min.js"
        crossorigin="anonymous"></script>

But from what you are saying, I’m guessing I can discard the ones that aren’t loading. So yes, you are correct, the 403s are spurious. It seems as if the only relevant part there is the “Failed to load resource: net::ERR_EMPTY_RESPONSE” message.

lemgog · July 11, 2021, 11:59pm

Ok here is a side-by-side comparison of me running both via the Bokeh serve command (right side) and the Flask run command (left side)

Here is what the browser shows for accessing the Bokeh-provided url.

And here is the correct rendering from accessing the Flask-provided url

The terminal outputs when running both in the Docker environment are exactly the same; however, the graphs do not render at all in the browser and I receive “Failed to load resource: net::ERR_EMPTY_RESPONSE” in the console. I am trying to determine why this is, as it seems the websocket is created when I visit the URL.

lemgog · July 12, 2021, 12:08am

Here is the same set of outputs, but now this is what is executing in my Docker container.
Output from “flask run” is on the left, and output from “bokeh serve” is on the right.

Here is what the corresponding browsers and consoles look like as well. As you can see, the Flask app (upper picture) now fails to load the resource.

Bryan · July 12, 2021, 12:45am

@lemgog If you are using the server programmatically, I believe you would need to explicitly call Bokeh’s basicConfig to set up logging. It’s not appropriate for a library to unconditionally configure logging; that is something for application code to control (which is why the bokeh serve application does it).

bokeh/logconfig.py at branch-2.4 · bokeh/bokeh · GitHub
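For reference, a sketch of turning on log output in application code before the Bokeh thread starts. It uses only the standard library's `logging` module rather than the Bokeh helper linked above; the function name `enable_server_logging` is made up here, and treating `"bokeh"` and `"tornado"` as the relevant logger names is an assumption based on the usual package-name convention.

```python
import logging
import sys


def enable_server_logging(level: int = logging.INFO) -> None:
    """Send log records to stdout and lower the level of the assumed logger names."""
    logging.basicConfig(
        level=level,
        stream=sys.stdout,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    # Assumed logger names; adjust if the libraries log under different ones.
    for name in ("bokeh", "tornado"):
        logging.getLogger(name).setLevel(level)


enable_server_logging()
```

Inside a container, writing to stdout also means the records show up in `docker logs` for that service.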

lemgog · July 12, 2021, 1:11am

@Bryan thank you I will look into implementing that, but even when I access the URL using the bokeh serve command, the result is the same. The logging output just shows that a websocket was created, and the graph does not render in the browser.

It’s odd, it seems to me as if I can connect to the Bokeh server from my Flask app (from the Bokeh server), but the Bokeh server isn’t executing the code for rendering my graph for some reason. I even put some print statements inside the Bokeh script I am using (spectrogram_plot.py) and they never execute. So it seems as if something is happening in-between my connecting to the server and the execution of the graph code.

lemgog · July 12, 2021, 4:41am

Ok, I changed some things around and got the graphs to render!

I changed all of the Bokeh server code to run in its own Docker container, then I accessed that URL. That way I was able to see the log data in the terminal. In docker compose, I just used the name of the service that hosts the Bokeh app in the URL.

http://visualization:5011/plot

In the returned script tag, the url still refers to “visualization”, which is the name of my docker compose service that my computer won’t know how to resolve, so I had to do a find-replace to replace it with localhost, and then it rendered correctly in my browser.

plot = server_document(os.environ.get('BOKEH_URL'))

plot = plot.replace("visualization", "localhost")
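Replacing the bare word `visualization` would also touch any other occurrence of that word in the tag. A slightly narrower sketch replaces only the scheme, host, and port; `swap_origin` is a hypothetical helper, and the sample tag is a stand-in, not real `server_document` output.

```python
from urllib.parse import urlsplit


def swap_origin(script: str, internal_url: str, public_url: str) -> str:
    """Replace the scheme://host:port of internal_url with that of public_url."""
    internal = urlsplit(internal_url)
    public = urlsplit(public_url)
    old = f"{internal.scheme}://{internal.netloc}"
    new = f"{public.scheme}://{public.netloc}"
    return script.replace(old, new)


# Stand-in for the <script> tag returned by server_document; not real Bokeh output.
tag = '<script src="http://visualization:5011/plot/autoload.js"></script>'
print(swap_origin(tag, "http://visualization:5011/plot", "http://localhost:5011"))
```

Keeping the public URL in its own environment variable next to `BOKEH_URL` would avoid hard-coding either name.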

One thing I don’t know how to resolve: is there a way to have multiple endpoints/applications within the same Bokeh server instance? If not, then I will have to create a separate Docker container for every Bokeh plot that I want to render. The Flask embed method had a handy way of making multiple Bokeh endpoints/plots available:

bokeh_tornado = BokehTornado({'/plot1': plot1, \
        '/plot2':plot2, '/plot3':plot3}...

Is there a way to do the same thing when starting Bokeh from the command line?

bokeh serve plot1.py plot2.py plot3.py

The below link mentions that it is possible, but I just want to verify that this is the recommended method to serve multiple plotting scripts and have different endpoints

I mainly just need a way to access different plots at different endpoints if possible.

Bryan · July 12, 2021, 3:38pm

Is there a way to do the same thing when starting Bokeh from the command line?

You can pass as many paths+applications as you like in this dictionary:

BokehTornado({'/spectrogram_plot': spectrogram_plot}, ... )
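On the command-line side, assuming the usual `bokeh serve` convention in which each script is served at a route named after the file (e.g. `plot1.py` at `/plot1`), the URLs to hand to `server_document` could be derived as in the sketch below. `app_urls` and the base URL are illustrative and not part of Bokeh.

```python
from pathlib import Path
from typing import Dict, Iterable


def app_urls(base_url: str, scripts: Iterable[str]) -> Dict[str, str]:
    """Map each script's stem to base_url/<stem>."""
    base = base_url.rstrip("/")
    return {Path(s).stem: f"{base}/{Path(s).stem}" for s in scripts}


# Example base URL and script names only.
urls = app_urls("http://localhost:5006", ["plot1.py", "plot2.py", "plot3.py"])
print(urls["plot2"])
```

Each Flask route would then look up its own entry, for example `urls["plot2"]`, instead of building the URL by hand.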