Passing arguments to bokeh from Flask + num-procs

Hi,

I use a bokeh server that is loaded when accessing a flask route. I am trying to make the server work with num-procs != 1. There is no problem when using num-procs 1.

The bokeh server is started with :

panel serve --num-procs 0 --use-xheaders --port $BOKEH_PORT --address 0.0.0.0 --allow-websocket-origin=* $CYCLOBS_HTTP_DIR/bokeh_apps/[^__]* &

@app.route('/cyclone/<sid>')
def trackServer(sid):
    # Stuff ...

    # Retrieve the script to include the bokeh app to the page.
    script = server_document('http://derive.ifremer.fr/bokeh_server/panel', arguments={"sid": sid})

    # Other stuff...

    return render_template("cyclone.html", script=script, resources=CDN.render(), template="Flask",
                           years=getAllAvailYears(),
                           cycInfos=cycInfos, year=cycInfos[1].year, basin=cycInfos[4][0], sid=sid, hasAcq=hasAcq)

In server_document function, I am passing a URL argument ‘sid’.

What is not working :
When using num-procs != 1, and accessing the bokeh server through the flask route, the bokeh server application does not always receive the url argument.
To be more precise :

  • The flask route is accessed
  • The bokeh server loads a first time, always correctly receiving the URL argument
  • Soon after, the bokeh server seems to load the app a second time. This time the URL argument is not always received (curdoc().session_context.request.arguments is empty).
    After the second load, if the URL argument is correctly received, the application shows in the browser. If it is not received, the app crashes because it needs this argument.

In my code, the line failing is :
args = curdoc().session_context.request.arguments because the returned dict should contain my url argument name and value
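For reference, the request arguments map each name to a list of byte strings (output later in this thread shows {'foo': [b'bar']}). Below is a minimal sketch of reading one value defensively, so that a missing argument gives an explicit fallback instead of a crash. The helper name get_arg is hypothetical and not part of Bokeh, and this only makes the failure visible; it does not make the missing argument appear.

```python
def get_arg(arguments, name, default=None):
    """Return the first value of a request argument as str, or default.

    `arguments` is expected to look like
    curdoc().session_context.request.arguments, i.e. {name: [bytes, ...]}.
    """
    values = arguments.get(name)
    if not values:
        return default
    first = values[0]
    return first.decode("utf-8") if isinstance(first, bytes) else str(first)


# A plain dict stands in for the real request arguments here.
sample = {"sid": [b"io022020"]}
print(get_arg(sample, "sid"))         # io022020
print(get_arg({}, "sid", "missing"))  # missing
```

In the app this would be called as get_arg(curdoc().session_context.request.arguments, "sid").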

What I tried :
It works correctly when accessing the bokeh application directly in my browser (not going through the flask route). When doing this, the bokeh app still seems to load 2 times, but both times it correctly gets the URL argument.
Could it be an issue with server_document? Something I’m doing wrong?

Probably related but issues are closed (and fixed):

What version of Bokeh?

bokeh                     2.0.2                    pypi_0    pypi
panel                     0.10.0a4                 pypi_0    pypi

If it can help :

    script = server_document('http://derive.ifremer.fr/bokeh_server/panel', arguments={"sid": sid})

    print(script)
<script id="1003">
  var xhr = new XMLHttpRequest()
  xhr.responseType = 'blob';
  xhr.open('GET', "http://derive.ifremer.fr/bokeh_server/panel/autoload.js?bokeh-autoload-element=1003&bokeh-app-path=/bokeh_server/panel&bokeh-absolute-url=http://derive.ifremer.fr/bokeh_server/panel&sid=io022020", true);
  
  xhr.onload = function (event) {
    var script = document.createElement('script'),
    src = URL.createObjectURL(event.target.response);
    script.src = src;
    document.body.appendChild(script);
  };
xhr.send();
</script>

@Skealz This should be resolved, since as of 2.x all HTTP request arguments are stuffed inside the websocket session token. So it should not matter if the websocket connection lands on a different process than the original HTTP request. But if you are saying you are seeing an issue, then it will require real investigation. For that we absolutely will need a complete, minimal reproducer we can run ourselves to look into.
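To illustrate why arguments carried in a token should be independent of which process handles the websocket, here is a toy sketch using only the standard library. It is not Bokeh's actual token format or code; the names make_token, read_token and SECRET are made up for illustration, and it assumes every worker process shares the same signing key.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"shared-secret"  # assumption: every worker process shares this key


def make_token(session_id, arguments):
    """Pack the session id and HTTP arguments into a signed token."""
    payload = json.dumps(
        {"session_id": session_id, "args": arguments}, sort_keys=True
    ).encode("utf-8")
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + signature


def read_token(token):
    """Verify the signature and return the payload dict."""
    encoded, signature = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("token signature mismatch")
    return json.loads(payload)


# "Process A" handles the HTTP request and issues the token.
token = make_token("abc123", {"sid": "io022020"})
# "Process B" receives the websocket connection carrying that token
# and can recover the arguments without talking to process A.
print(read_token(token)["args"])
```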

cc @Philipp_Rudiger in case you have any ideas

I’ll work on an example and post it here.


@Bryan You can find the minimal example I’ve made here
https://git.theocevaer.fr/theo/exampleprocs

@Skealz when I run that code flask returns a 404 just on the request to 127.0.0.1:5000

dev) ❯  * Serving Flask app "flask_app.py"
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
2020-06-16 02:21:35,444 Starting Bokeh server version 2.1.0+6.g5e72ad3e4.dirty (running on Tornado 6.0.3)
2020-06-16 02:21:35,444 Host wildcard '*' will allow connections originating from multiple (or possibly all) hostnames or IPs. Use non-wildcard values to restrict access explicitly
2020-06-16 02:21:35,445 User authentication hooks NOT provided (default user enabled)
2020-06-16 02:21:35,454 Bokeh app running at: http://0.0.0.0:5006/bokeh_app
2020-06-16 02:21:35,454 Bokeh app running at: http://0.0.0.0:5006/bokeh_app
2020-06-16 02:21:35,455 Starting Bokeh server with process id: 79838
2020-06-16 02:21:35,455 Starting Bokeh server with process id: 79839
2020-06-16 02:21:35,455 Bokeh app running at: http://0.0.0.0:5006/bokeh_app
2020-06-16 02:21:35,455 Starting Bokeh server with process id: 79840
2020-06-16 02:21:35,455 Bokeh app running at: http://0.0.0.0:5006/bokeh_app
2020-06-16 02:21:35,456 Starting Bokeh server with process id: 79841
127.0.0.1 - - [16/Jun/2020 02:21:42] "GET / HTTP/1.1" 404 -

This is because the flask route expects a URL path “parameter”, i.e.:
http://localhost:5000/anyvalue
http://localhost:5000/35
http://localhost:5000/bl540

The “anyvalue” should then be printed in the console by the bokeh application.

Also, the "stop" command does not clean up any of the running procs.

If I access http://127.0.0.1:5000/test?foo=bar and reload ~50 times in rapid succession, I always see:

ARGS  {'foo': [b'bar']}

as expected. Are you referring to something else? Or are you saying you are not always seeing the HTTP request args like this?

OK, I can eventually see a case where that print output is absent. A more helpful app code is this:

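from bokeh.io import curdoc
from bokeh.models import Div

# Show the request arguments in the page as well as in the server console.
curdoc().add_root(Div(text='ARGS ' + str(curdoc().session_context.request.arguments)))
print('ARGS', curdoc().session_context.request.arguments)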

Which shows

ARGS {'bokeh-autoload-element': [b'1006'], 'bokeh-app-path': [b'/bokeh_app'], 'bokeh-absolute-url': [b'http://localhost:5006/bokeh_app'], 'value': [b'test']}

in the cases where the print output is missing.

@Philipp_Rudiger I think this is a consequence of the WS landing on the same process as the original HTTP request. In that case, the existing session is re-used (i.e. the app code is not run again). Which is also consistent with it happening relatively infrequently when n=4.
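As a rough sanity check on "relatively infrequently when n=4", here is a small simulation. It assumes, purely for illustration, that each connection is assigned to a worker process uniformly at random and independently; the real distribution depends on the OS and is not established in this thread. Under that assumption the HTTP and WS connections share a process about 1 time in n.

```python
import random


def same_process_rate(num_procs, trials=100_000, seed=0):
    """Estimate how often the HTTP and WS connections hit the same process.

    Assumption: uniform, independent assignment of connections to processes.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        http_proc = rng.randrange(num_procs)
        ws_proc = rng.randrange(num_procs)
        hits += http_proc == ws_proc
    return hits / trials


for n in (1, 2, 4, 8):
    # Under the stated assumption the estimate should be close to 1 / n.
    print(n, round(same_process_rate(n), 3))
```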

A short-term solution may simply be to always re-create an app session by running the app code on the WS connect. That’s obviously expensive/wasteful in many cases, though. The best solution would be to only run it once on the WS connect, and not run it at all on the HTTP connection. But that would require some breaking changes (as we have previously discussed).

@Skealz You should make a GitHub issue with the information you provided. I am afraid I don’t have any workaround to provide in the immediate term.


That’s interesting.

You are loading a URL that looks like:
http://127.0.0.1:5000/test?foo=bar
When I try this URL, I have the following behavior :

127.0.0.1 - - [16/Jun/2020 11:36:22] "GET /test?foo=bar HTTP/1.1" 200 -
ARGS  {'bokeh-autoload-element': [b'1004'], 'bokeh-app-path': [b'/bokeh_app'], 'bokeh-absolute-url': [b'http://localhost:5006/bokeh_app'], 'value': [b'test']}
2020-06-16 11:36:22,639 WebSocket connection opened
ARGS  {'foo': [b'bar']}
2020-06-16 11:36:22,641 ServerConnection created

value=test is correctly passed the first time but not the second.
foo=bar is correctly passed the second time but not the first.

In this minimal application, the value given after the “host” portion of the URL is transferred to bokeh. Check for the value variable in this snippet.

@app.route('/<value>')
def index(value):
    # Retrieve the script to include the bokeh app to the page.
    script = server_document('http://localhost:5006/bokeh_app', arguments={"value": value})

    return render_template("app.html", script=script, resources=CDN.render())

Well, I guess that might be another (probably related) issue. As I mentioned, the appropriate next step at this point is a report on GitHub.

A short-term solution that may work is to do
http://localhost:5000?value=test

then

from flask import request

@app.route("/")
def index():
    urlParamValue = request.args.get('value')

    # Retrieve the script to include the bokeh app to the page.
    script = server_document('http://localhost:5006/bokeh_app', arguments={"value": urlParamValue})

    return render_template("app.html", script=script, resources=CDN.render())

I will create an issue on GitHub today.

I confirm that the code I posted works as a short-term solution.

Thanks for your help !

@Philipp_Rudiger actually at first I didn’t notice that the args were passed in the embedding call. Is there a code path that prefers url args over any sent by the embedding call (there must be)? Fixing that would be sufficient I think.

I think I’m missing some context here, did you mean to tag me?

@Philipp_Rudiger Yes, this is an issue with the processing of HTTP request arguments, which go through the session token. Summarized, the situation is this:

  • Bokeh app embedded with arguments supplied:
    script = server_document(..., arguments=dict(foo="bar"))

Now, a user navigates to the page that embeds the app, and also supplies args on the URL:

https://some.flaskapp.com/stuff?baz=quux

If --num-procs=1 then the server HTTP and ws connections land on the same process. The app code is executed once for the session, and sees foo=bar.

However, if --num-procs > 1 then one of two things can happen:

  • HTTP and ws land on the same process.

    The app code is executed once for the session, and sees foo=bar

  • HTTP and ws land on different processes.

    The app code is executed twice for the session, but the executions see different values for the arguments. The first execution (for the HTTP connect) sees foo=bar. But the second execution (for the ws connect on the other process) does not. Instead that app code sees baz=quux from the browser URL.

My hypothesis here is that somewhere there is a condition to choose HTTP args either out of the token, or off the URL, and that it is doing the wrong thing. It should always prefer any arguments from the token over any in the URL. (In fact for embedding specifically it’s possible we should ignore the URL, since the URL is for the Flask or whatever app, and not for the Bokeh app)
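The preference described in this hypothesis could be sketched as follows. This is not Bokeh's code: resolve_arguments and its embedded flag are hypothetical names, used only to show "token wins over URL, and for embedding the URL is ignored".

```python
def resolve_arguments(token_args, url_args, embedded=True):
    """Pick the arguments the app code should see.

    token_args: arguments carried in the session token (from the embed call)
    url_args:   arguments parsed from the URL of the connection
    embedded:   if True, ignore url_args entirely
    """
    if embedded:
        return dict(token_args)
    merged = dict(url_args)
    merged.update(token_args)  # token values win on conflicts
    return merged


token_args = {"foo": [b"bar"]}   # from server_document(..., arguments=...)
url_args = {"baz": [b"quux"]}    # from https://some.flaskapp.com/stuff?baz=quux

print(resolve_arguments(token_args, url_args))
# {'foo': [b'bar']}
print(resolve_arguments(token_args, url_args, embedded=False))
# {'baz': [b'quux'], 'foo': [b'bar']}
```

With this rule both executions of the app code would see foo=bar, regardless of which process the ws connection lands on.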
