Graphs not rendering for Bokeh Server Embed in Flask running in Docker Container

I have a Flask application with a Bokeh server embedded as documented in the example at bokeh/flask_gunicorn_embed.py at ed54705d627ae500509e57fe071fc7a60fc8a951 · bokeh/bokeh · GitHub

However, it seems as if the ports for the Bokeh server are generated dynamically in this code (i.e. cannot be predicted ahead of time and exposed in my Dockerfile), so when I create a Docker container for my Flask application, I don’t have the correct ports exposed. When I visit the route in the Flask app, it renders the webpage but not the Bokeh server document graph on the webpage. No other errors are being thrown, so I assume this has something to do with the port that the Bokeh server is running on not being exposed. The code works fine when running outside of a Docker container. I just cannot render the graphs on my host machine when it is running in a Docker container. Is there another way to approach handling ports with Bokeh embedded in a multi-threaded production application on gunicorn? Or to hard-code in the ports being used for multiple workers?

Hi @lemgog

The relevant lines of the example you referenced are …

# This is so that if this app is run using something like "gunicorn -w 4" then
# each process will listen on its own port
sockets, port = bind_sockets("localhost", 0)

A zero-valued second argument to the bind_sockets function means that the operating system will choose a free port, and the port return argument indicates what was chosen. See description in the bokeh reference guide here.

You could specify a non-zero second argument in the bind_sockets call to hard-code a port number that is compatible with your docker file configuration.
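To illustrate the port-zero behaviour described above, here is a minimal sketch that uses only the standard library socket module rather than Bokeh's bind_sockets. The helper name pick_free_port is hypothetical.

```python
import socket

def pick_free_port(host="127.0.0.1"):
    """Bind to port 0 and report which port the OS assigned (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))          # 0 asks the OS for any free port
        return s.getsockname()[1]  # the port that was actually chosen

port = pick_free_port()
print("OS-assigned port:", port)
```

The number changes from run to run, which is why it cannot be listed ahead of time in a Dockerfile.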

Yes, I understand that. However, if I hard-code a port, will that still function with multiple workers? Won’t every worker then be trying to use the same port? I guess I could add some logic in a for loop that checks whether the port is in use (sockets - Fast way to test if a port is in use using Python - Stack Overflow), moves on to the next port number within a particular range if it is, and then expose that range in my docker-compose file. It doesn’t seem like the most elegant way to do it though.

for port in range(5006, 5011):
    if port.in_use:
        continue
    else:
        sockets, port = bind_sockets("localhost", port)
        break

No. It will fail ungracefully – crashing gunicorn – if you hardcode a single number and try to use multiple workers.

As you surmise, you will need to add some additional logic to provide deterministic unique port numbers compatible with your docker file constraints. This could be a simple loop over ports as you’ve shown or some other way to maintain state and allocate ports to the workers.

If you go with the loop over ports as in your example, I believe you need to change the logic slightly as the port variable is being used both for the loop iterator and the return argument of bind_sockets.
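A minimal sketch of such a loop, with separate names for the loop variable and the chosen port. It uses standard library sockets rather than bind_sockets, and the helper name bind_first_free is hypothetical. It tries to bind directly and catches the failure, instead of probing the port first, so there is no gap between the check and the bind.

```python
import socket

def bind_first_free(host, candidates):
    """Return (listening_socket, port) for the first candidate port that binds.

    Hypothetical helper, not part of Bokeh or Tornado.
    """
    for candidate in candidates:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, candidate))
        except OSError:
            sock.close()  # port is taken, try the next one
            continue
        sock.listen()
        return sock, candidate
    raise RuntimeError("no free port in the given range")
```

With Bokeh, the same pattern would presumably wrap the bind_sockets call in the try block; that it raises OSError for a port already in use is an assumption worth verifying.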

Ok, so I was able to implement that and the Bokeh plots render fine within the webpage when I run the flask app just by itself. However, when I run it from within the Docker container, no errors are thrown and the graphs don’t render. I assign the ports within a range as documented above and I have exposed those same ports in my docker-compose.yml. Is there anything else that I need to do or something that I am missing to explain why the plots are not rendering when running in the Docker container but render correctly outside of it?

Perhaps it is my allow-websocket-origins argument. I have it all set to localhost, but the Docker container’s ip address is different, so the origin would be different. Since the IP address is not static, is there a way to account for this without requiring a static ip address assigned to the docker container?
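For reference, the origin a browser sends with a websocket request is that of the page it loaded, which here would be the Flask page rather than the Bokeh port, so a short list may be enough. A minimal sketch of building one, assuming a hypothetical helper build_origins and assuming Flask is served on port 5000:

```python
from itertools import product

def build_origins(hosts, ports):
    """Return 'host:port' strings for every host/port combination (hypothetical helper)."""
    return [f"{host}:{port}" for host, port in product(hosts, ports)]

# Hosts a browser might use to reach the Flask page, and the assumed Flask port.
# Both values are assumptions to adapt to the actual deployment.
origins = build_origins(["localhost", "127.0.0.1"], [5000])
print(origins)  # ['localhost:5000', '127.0.0.1:5000']
```

The resulting list could then be passed as extra_websocket_origins.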

This is what I have tried for the websocket origins. settings.port is the variable that I keep the port that the Bokeh server is running on for a particular thread saved in so the rest of the Flask app can access it. As you can see, I tried adding "*" to the websocket origins as a catch-all, but the plots still are not rendering in the Flask app. The 172.21.0.1 IP address is the IP address of the Docker container that was running.

bokeh_tornado = BokehTornado(
    {'/plot1': plot1, '/plot2': plot2, '/plot3': plot3},
    extra_websocket_origins=[
        "127.0.0.1:5000", "127.0.0.1:" + str(settings.port),
        "0.0.0.0:" + str(settings.port), "localhost:" + str(settings.port),
        "0.0.0.0:5000", "localhost:5000",
        "172.21.0.1:5000", "172.21.0.1:" + str(settings.port),
        "*",
    ],
)

I suspect that my issue is similar to those documented here, as the server document is not rendering the graphs for some reason that I cannot determine, and no errors are output. Any recommendations on the best way to troubleshoot? Thank you.

What does “no errors are output” mean, exactly? Typically there is potentially valuable information in both the browser JavaScript console log and the console log of the Bokeh server process. Have you examined both of these?

In any case, even an absence of information is information. The Bokeh server logs every connection attempt, whether it was successful or rejected. If there is really nothing interesting in your Bokeh server logs, then that means no connections ever even reached it to begin with—your problem is upstream before the Bokeh server (e.g. a misconfigured proxy or a firewall blocking something).

So what I mean by this is nothing is sent to stdout on the terminal I am running the Flask app from. So I enter “Flask run”, and there is no output from the Bokeh server there indicating a failure or error in the document script. In the browser, this is what I see:

This makes me think it has something to do with the websocket origins I have specified. This is the method in which I am attempting to access the bokeh server document in my Flask route

spectrogram_plot = server_document('http://localhost:%d/spectrogram_plot' % int(settings.port))

Where settings.port is how I am keeping track of the port for that particular worker thread. Should I be using something different than localhost?
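One thing that may be worth checking here, offered as an assumption rather than a diagnosis: a socket bound to localhost inside a container only accepts connections that originate inside that container, so a published port cannot reach it. A minimal standard-library sketch of choosing the bind address (the BOKEH_BIND_HOST variable name is hypothetical):

```python
import os
import socket

# Hypothetical environment variable: "0.0.0.0" inside Docker so published
# ports can reach the server, "127.0.0.1" for local-only development.
bind_host = os.environ.get("BOKEH_BIND_HOST", "0.0.0.0")

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind((bind_host, 0))  # port 0 is used here only to keep the demo conflict-free
    address, port = s.getsockname()
    print(f"listening address: {address}, port: {port}")
```

The same host value would go in place of "localhost" in the bind_sockets call. The URL passed to server_document is a separate matter: it still has to be one the browser can resolve.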

I also have some print statements inside my code that creates the plot, and those print statements never execute while running inside the Docker container.

Or is there another way to view the logs from the Bokeh server that I am not seeing or implementing?

Another thing that might be relevant is when I access the route that calls the bokeh server, I get the below output from werkzeug:

However, when I run ifconfig in the Docker container, the IP address is different

I am not sure if this is an issue or not, and don’t really understand why the IP addresses would be different

If a Bokeh server rejects a connection because it doesn’t like the origin configuration, that is 100% always logged in the Bokeh server output (including instructions with exactly what change is needed to allow the connection).

The 403s look spurious to me, because those CSS files no longer exist with recent versions of Bokeh. Are you specifying CDN resource loads manually in a template?

Also just to make sure we are on the same page, this is the log output I am talking about:

base ❯ bokeh serve --show sliders.py
2021-07-11 16:13:18,913 Starting Bokeh server version 2.4.0dev1-43-g4aac79f80 (running on Tornado 6.1)
2021-07-11 16:13:18,915 User authentication hooks NOT provided (default user enabled)
2021-07-11 16:13:18,918 Bokeh app running at: http://localhost:5006/sliders
2021-07-11 16:13:18,918 Starting Bokeh server with process id: 1685
2021-07-11 16:13:19,806 404 GET /apple-touch-icon-precomposed.png (::1) 0.43ms
2021-07-11 16:13:19,809 404 GET /apple-touch-icon.png (::1) 0.33ms
2021-07-11 16:13:19,822 WebSocket connection opened
2021-07-11 16:13:19,822 ServerConnection created
2021-07-11 16:13:19,836 404 GET /favicon.ico (::1) 0.92ms

You can see the connection logged there. If the server rejects a connection, that is also logged:

2021-07-11 16:14:36,951 Refusing websocket connection from Origin 'http://localhost:5006'; use --allow-websocket-origin=localhost:5006 or set BOKEH_ALLOW_WS_ORIGIN=localhost:5006 to permit this; currently we allow origins {'foo.com:80'}

If neither of those are present in the log, then as I said above, no connection ever reached the server at all. The problem is somewhere earlier in front of Bokeh.

So I am not starting the Bokeh server in that manner, I am using the method in this example

My code to do so is here:

spectrogram_plot = Application(FunctionHandler(spectrogram_plot))

# This is so that if this app is run using something like "gunicorn -w 4" then
# each process will listen on its own port

def is_port_in_use(port):
    import socket
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(('localhost', port)) == 0
    
from app import settings
settings.init()
for p in range(5006, 5011):
    if is_port_in_use(p):
        continue
    else:
        sockets, settings.port = bind_sockets("localhost", p)
        break

def bk_worker():
    asyncio.set_event_loop(asyncio.new_event_loop())

    bokeh_tornado = BokehTornado(
        {'/spectrogram_plot': spectrogram_plot},
        extra_websocket_origins=[
            "127.0.0.1:5000", "127.0.0.1:" + str(settings.port),
            "0.0.0.0:" + str(settings.port), "localhost:" + str(settings.port),
            "0.0.0.0:5000", "localhost:5000",
            "172.21.0.1:5000", "172.21.0.1:" + str(settings.port),
            "172.21.0.2:5000", "172.21.0.2:" + str(settings.port),
            "*",
        ],
    )
    bokeh_http = HTTPServer(bokeh_tornado)
    bokeh_http.add_sockets(sockets)

    server = BaseServer(IOLoop.current(), bokeh_tornado, bokeh_http)
    server.start()
    server.io_loop.start()

t = Thread(target=bk_worker)
t.daemon = True
t.start()

Then within my Flask routes.py, I have

spectrogram_plot = server_document('http://localhost:%d/spectrogram_plot' % int(settings.port))

Since this is the way I access the Bokeh server, no logs are output to stdout. Is there a different way that I should configure this that is better? This is what I see when I run my spectrogram_plot.py via the cmd line

However, I don’t get any graphs (just a blank page) when I visit the provided URL in my browser. In the same environment, using just the code embedded in my Flask app, the plots render fine. I suspect the difference comes from a database module that is present in my Flask app but not available when the Bokeh script runs by itself. When I visit the provided URL in a Docker environment using the bokeh serve command, it creates the websocket connection successfully and no graphs render, the same as in my non-Docker environment. However, when accessed through the Flask app, the graphs render successfully in my non-Docker environment, and they don’t in the Docker environment.

These are the scripts I have embedded in my html template

      <link rel="stylesheet" href="http://cdn.pydata.org/bokeh/release/bokeh-2.3.0.min.css" type="text/css" />
      <link href="http://cdn.bokeh.org/bokeh/release/bokeh-tables-2.3.0.min.css" rel="stylesheet" type="text/css">
      <script type="text/javascript" src="http://cdn.pydata.org/bokeh/release/bokeh-2.3.0.min.js"></script>
      <script src="https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.3.0.min.js"
        crossorigin="anonymous"></script>
      <script src="https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.3.0.min.js"
        crossorigin="anonymous"></script>

But from what you are saying, I’m guessing I can discard the ones that aren’t loading. So yes, you are correct, the 403s are spurious. It seems as if the only relevant section from there is the “Failed to load resource: net::ERR_EMPTY_RESPONSE”

Ok here is a side-by-side comparison of me running both via the Bokeh serve command (right side) and the Flask run command (left side)

Here is what the browser shows for accessing the Bokeh-provided url.

And here is the correct rendering from accessing the Flask-provided url

The terminal outputs when running both in the Docker environment are exactly the same; however, the graphs do not render at all in the browser and I receive the “Failed to load resource: net::ERR_EMPTY_RESPONSE” in the console. I am trying to determine why this is, as it seems the websocket is created when I visit the URL.

Here is the same set of outputs, but now this is what is executing in my Docker container.
Output from “flask run” is on the left, and output from “bokeh serve” is on the right.

Here is what the corresponding browsers and consoles look like as well. As you can see, the Flask app (upper picture) now fails to load the resource.

@lemgog If you are using the server programmatically, I believe you would need to explicitly call Bokeh’s basicConfig to set up logging. It’s not appropriate for a library to unconditionally configure logging; that is something for application code to control (which is why the bokeh serve application does it).

bokeh/logconfig.py at branch-2.4 · bokeh/bokeh · GitHub
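As a rough sketch of what setting up logging in application code could look like, the snippet below uses the standard library's logging.basicConfig. The format string and level are assumptions, and Bokeh's own basicConfig in the logconfig module linked above may be the better fit.

```python
import logging

# Configure the root logger once, early in application start-up, so that
# records from library loggers are written to the terminal (stderr by default).
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Bokeh and Tornado log under these logger names.
for name in ("bokeh", "tornado.access", "tornado.application", "tornado.general"):
    logging.getLogger(name).setLevel(logging.INFO)
```

This would need to run before the thread that starts the Bokeh server, so that connection attempts and refusals show up in the same terminal as the Flask output.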

@Bryan thank you I will look into implementing that, but even when I access the URL using the bokeh serve command, the result is the same. The logging output just shows that a websocket was created, and the graph does not render in the browser.

It’s odd, it seems to me as if I can connect to the Bokeh server from my Flask app (from the Bokeh server), but the Bokeh server isn’t executing the code for rendering my graph for some reason. I even put some print statements inside the Bokeh script I am using (spectrogram_plot.py) and they never execute. So it seems as if something is happening in-between my connecting to the server and the execution of the graph code.

Ok, I changed some things around and got the graphs to render!

I changed all of the Bokeh server code to run in its own Docker container, then I accessed that URL. That way I was able to see the log data in the terminal. In docker compose, I just used the service name in the URL that hosted the Bokeh app.

http://visualization:5011/plot

In the returned script tag, the url still refers to “visualization”, which is the name of my docker compose service that my computer won’t know how to resolve, so I had to do a find-replace to replace it with localhost, and then it rendered correctly in my browser.

plot = server_document(os.environ.get('BOKEH_URL'))

# Swap the docker compose service name for a host the browser can resolve
plot = plot.replace("visualization", "localhost")
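As far as I can tell, server_document only builds the script tag and does not contact the server itself, so an alternative to the find-replace may be to rewrite the URL before passing it in. A minimal sketch, assuming a hypothetical helper to_browser_url and that only the host name differs between the internal and browser-facing URLs:

```python
from urllib.parse import urlsplit, urlunsplit

def to_browser_url(internal_url, public_host):
    """Swap the host of internal_url for public_host, keeping scheme, port and path.

    Hypothetical helper, not part of Bokeh.
    """
    parts = urlsplit(internal_url)
    netloc = public_host if parts.port is None else f"{public_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))

browser_url = to_browser_url("http://visualization:5011/plot", "localhost")
print(browser_url)  # http://localhost:5011/plot
```

Unlike a plain string replace on the returned script, this leaves any other occurrence of the word "visualization" untouched.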

One thing I don’t know how to resolve: is there a way to have multiple endpoints/applications within the same Bokeh server instance? If not, then I will have to create a separate Docker container for every Bokeh plot that I want to render. The Flask embed method had a handy way of making multiple Bokeh endpoints/plots available via the following method:

bokeh_tornado = BokehTornado({'/plot1': plot1, \
        '/plot2':plot2, '/plot3':plot3}...

Is there a way to do the same thing when starting Bokeh from the command line?

bokeh serve plot1.py plot2.py plot3.py

The below link mentions that it is possible, but I just want to verify that this is the recommended method to serve multiple plotting scripts and have different endpoints

I mainly just need a way to access different plots at different endpoints if possible.
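To my understanding, bokeh serve derives each application's URL path from the script's file name, so the command above would serve the three scripts at /plot1, /plot2 and /plot3 on the same port. A small sketch of that naming rule (the helper route_for is hypothetical and only mirrors the rule, it does not call Bokeh):

```python
from pathlib import Path

def route_for(script_path):
    """Route for a script: '/' plus the file name without its extension (hypothetical helper)."""
    return "/" + Path(script_path).stem

routes = {route_for(p): p for p in ["plot1.py", "plot2.py", "plot3.py"]}
print(routes)
```

Each route could then be passed to server_document with the same host and port, for example a URL ending in /plot2.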

Is there a way to do the same thing when starting Bokeh from the command line?

You can pass as many paths+applications as you like in this dictionary:

BokehTornado({'/spectrogram_plot': spectrogram_plot}, ... )