Thanks for all your help. It turns out that ZeroMQ does have a polling interface, so I was able to put that in a periodic callback function and now it seems to work well. Bokeh really is an incredible package and I’m really glad I found it.
As I build out the rest of my PoC, the next question is how best to build up to more complicated pages. Eventually I’ll need multiple plots on a page, along with other content not generated by Bokeh, plus HTTPS, authentication, etc. Is this something the Bokeh server can fit into? Should it be in control of the app, or should it just be embedded into the pages where the plots are needed? I’m guessing I probably need to dig into Tornado.
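For anyone finding this thread later, here is a minimal sketch of the polling approach, assuming a pyzmq-style socket whose `poll(0)` returns immediately with a truthy value when a message is waiting. `drain_ready_messages`, `periodic_callback` and `FakeSocket` are hypothetical names, and the stand-in socket exists only so the sketch runs without ZeroMQ. In a real app the callback would be registered with something like `curdoc().add_periodic_callback(...)`.

```python
from collections import deque


def drain_ready_messages(sock, max_messages=10):
    """Return the messages that are ready right now, without blocking.

    `sock` is assumed to behave like a pyzmq socket: poll(0) returns
    immediately with a truthy value if a message is waiting.
    """
    messages = []
    while len(messages) < max_messages and sock.poll(0):
        messages.append(sock.recv())
    return messages


class FakeSocket:
    """Hypothetical stand-in for a ZeroMQ socket, for illustration only."""

    def __init__(self, pending):
        self._pending = deque(pending)

    def poll(self, timeout):
        # Non-zero when at least one message can be read.
        return 1 if self._pending else 0

    def recv(self):
        return self._pending.popleft()


def periodic_callback(sock, handle):
    """Body of a periodic callback: handle what is ready, then return."""
    for message in drain_ready_messages(sock):
        handle(message)


received = []
periodic_callback(FakeSocket([b"a", b"b"]), received.append)
```

Capping the number of messages per call keeps a burst of data from holding up the server thread for long; the rest is picked up on the next tick.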
On Tue, Jan 19, 2016 at 4:11 AM, Havoc Pennington [email protected] wrote:
I think the main thing is what you’re familiar with and like to use. This is a reasonable use of bokeh server - what I’ve described here I don’t consider a hack or anything, it should work well (unless there’s some bug because it’s new code). It does avoid JavaScript, is fairly concise, and it ought to be pretty efficient since you are all async with only the one thread. I don’t know a lot about flask so I can’t compare how you’d do this there.
If we did the new API I was thinking about at the end of my post this kind of thing would be almost trivial so that direction seems exciting…
Havoc
On Jan 19, 2016, at 12:15 AM, Matt Ettus [email protected] wrote:
Havoc,
Thanks for the detailed response. Maybe I’ll take another tack here. I originally started by modifying the audio spectrogram example, which is based on flask and threads. I was able to get it doing useful and interesting plots, but I decided to make the switch to using the bokeh server for a couple of reasons. First, I found myself writing both python and coffeescript/javascript with a lot of overlap, and I felt like it was a lot of repetition. Second, it seemed like the bokeh server was “the right way” to be doing this. Would I be better off sticking with the flask model, or is there a third way I am missing?
Thanks,
Matt
–
You received this message because you are subscribed to the Google Groups “Bokeh Discussion - Public” group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/CAN%3D1kn_cwzeGi2%2BtFPjq6fTLg-A1eKhsaPyxxMu%3DFQ4sj5fKLw%40mail.gmail.com.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.
On Mon, Jan 18, 2016 at 7:52 PM, Havoc Pennington [email protected] wrote:
Just thinking, you could do a similar thing to my suggestion in the main script without getting into the lifecycle hooks. Use document.add_next_tick_callback instead of server_context.add_next_tick_callback, and your callback can update only curdoc() and will have the document automatically locked.
So you avoid ServerContext, SessionContext, and explicit document locking.
The trade off is a separate thread per session though.
Havoc
On Jan 18, 2016, at 10:22 PM, Havoc Pennington [email protected] wrote:
Hi,
This gets pretty “fun”, unfortunately async IO and threads are “fun” and once you are doing certain things it’s hard to completely conceal the fun… I’ll try to give you some leads.
I don’t know what’s up with the pyzmq exception… if you could figure that out, it’s potentially an easier path than threads. However, I’ll try to explain below what to do if you have only blocking IO available, and so have to use a thread… and maybe it will also be useful for others who have to use a thread.
On Mon, Jan 18, 2016 at 6:21 PM, Matt Ettus [email protected] wrote:
RuntimeError: _pending_writes should be non-None when we have a document lock, and we should have the lock when the document changes
The lock here is a Tornado lock, so it’s protecting against access while someone is waiting on a Tornado event (during a yield in a coroutine), but it doesn’t protect against multithreaded access. Unfortunately Bokeh has no thread locks. See http://www.tornadoweb.org/en/stable/locks.html
Normally, all code in the session script (the one you pass to bokeh serve, that fills in the session for each browser tab) is run with the lock held. However, by creating the thread, you are “escaping” the scope of the lock that’s normally held.
The following probably should be made to work, but isn’t safe right now: save session_doc = curdoc() in a global var in the session script, then do session_doc.add_next_tick_callback() in your thread which would add a callback that tornado will run in its thread. This could be made to work, but right now it updates Bokeh data structures that do not have thread locks on them, so it probably works 99% of the time and then breaks 1% of the time. Yay threads. Some work is needed for this to be safe. Tornado supports calling IOLoop.add_callback from another thread, which is what add_next_tick_callback uses, but Bokeh has its own book-keeping around add_next_tick_callback that isn’t threadsafe.
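To illustrate the general pattern (a worker thread hands a callback to the event-loop thread instead of touching shared state itself), here is a small sketch. It uses the standard library’s asyncio and `loop.call_soon_threadsafe` as a stand-in for Tornado’s `IOLoop.add_callback`; it is not Bokeh code and says nothing about Bokeh’s own book-keeping, which is the part that isn’t threadsafe. All names in it are made up for the example.

```python
import asyncio
import threading


async def main():
    loop = asyncio.get_running_loop()
    state = {"values": [], "threads": set()}
    done = asyncio.Event()

    def apply_update(value):
        # Runs on the event-loop thread, so no thread lock is needed here.
        state["values"].append(value)
        state["threads"].add(threading.current_thread().name)
        if len(state["values"]) == 3:
            done.set()

    def worker():
        # Runs on a separate thread: never touch `state` directly, only
        # hand the update over to the loop thread.
        for value in (1, 2, 3):
            loop.call_soon_threadsafe(apply_update, value)

    thread = threading.Thread(target=worker, name="producer")
    thread.start()
    await done.wait()
    thread.join()
    return state


result = asyncio.run(main())
```

Every mutation of `state` happens on the loop thread; the producer thread only schedules work.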
However, I don’t think that’s the best way to implement your app anyway: you’d have a thread per session (session = browser tab, typically), when I’m guessing a thread per process makes more sense, if only to avoid memory leakage.
Where you can set up a thread per process is in the on_server_loaded lifecycle callback. See
http://bokeh.pydata.org/en/latest/docs/user_guide/server.html#lifecycle-hooks
So you’d spawn the thread there then return from on_server_loaded.
Your app will have to be a “directory” app (session script in main.py, on_server_loaded in server_lifecycle.py, pass the directory path to bokeh serve)
Once you have a ServerContext as passed to on_server_loaded, you can look at server_context.sessions which is an iterable of SessionContext.
In the lifecycle callbacks, the document lock is not held until you take it explicitly using SessionContext.with_locked_document
https://github.com/bokeh/bokeh/blob/23f2577ba583402b9574a6067aebb82a99bf989e/bokeh/application/application.py#L84-L94
(this doesn’t appear to be in the generated reference docs, some sort of oversight, but there are doc strings in the code)
To update your sessions you’d do something like this (untested) in server_lifecycle.py:
from tornado import gen

# this is a callback that runs forever, but yields back to the IOLoop
# whenever it's waiting on a future
@gen.coroutine
def update_sessions_callback(server_context):
    while True:
        data = yield get_a_future_with_new_data()

        # now we have to update ALL the sessions... this function is
        # called with the document lock held
        def update_one_session(doc):
            # modify doc however you want in here
            source = doc.get_model_by_name("my-source")
            # x is not defined in this snippet; supply your own x values
            source.data = dict(x=x, y=data)

        # iterate over all sessions
        for session_context in server_context.sessions:
            # be sure they haven't expired
            if not session_context.destroyed:
                # wait until we get the lock, then modify the session
                yield session_context.with_locked_document(update_one_session)

# this is called on server startup
def on_server_loaded(server_context):
    server_context.add_next_tick_callback(lambda: update_sessions_callback(server_context))
If the “gen” stuff doesn’t make sense, check out the docs on Tornado http://tornadokevinlee.readthedocs.org/en/latest/gen.html
update_sessions_callback is an infinite loop in the server thread, BUT at each “yield” it allows other server code to run.
What you don’t have here yet is a way to implement get_a_future_with_new_data(), which is a function returning a Future, where the Future is completed with each chunk of new data.
I’ll leave this a bit as an exercise in part because I’d have to do some figuring and I don’t know for sure what I’m about to suggest will work, but I think you’d want to use concurrent.futures.Future (which is threadsafe), NOT the tornado Future which isn’t. https://docs.python.org/3.4/library/concurrent.futures.html#concurrent.futures.Future
The idea I would have is that get_a_future_with_new_data() would return a concurrent.futures.Future; the docs say the only way to create one of those is with Executor.submit, so you could have a global executor with one thread, and submit tasks to it that return your data.
Something like:
from concurrent.futures import ThreadPoolExecutor
import numpy as np

# note this executor is global, not made anew on every function call
executor = ThreadPoolExecutor(max_workers=1)

def recv_task():
    while True:
        # block until we have something to put in the future
        # (zmq_socket is your already-connected ZeroMQ socket)
        raw_data = np.frombuffer(zmq_socket.recv(), dtype=np.complex64)
        signal = raw_data / 1.0  # convert to proper type
        if signal.size > 199:
            y = signal.real[0:200]
            return y  # this completes the Future
        # otherwise keep looping
get_a_future_with_new_data() is something like:

def get_a_future_with_new_data():
    return executor.submit(recv_task)
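The executor idea can be tried without ZeroMQ or Bokeh at all. Below is a minimal, self-contained sketch in which a `queue.Queue` stands in for the blocking socket; `incoming` and `MIN_SAMPLES` are assumptions for the example, and plain lists replace the numpy arrays.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the blocking ZeroMQ socket (an assumption for this sketch).
incoming = queue.Queue()

# One global executor with a single worker thread.
executor = ThreadPoolExecutor(max_workers=1)

MIN_SAMPLES = 200


def recv_task():
    """Block until a chunk with at least MIN_SAMPLES items arrives."""
    while True:
        chunk = incoming.get()  # blocks, like zmq_socket.recv()
        if len(chunk) >= MIN_SAMPLES:
            return chunk[:MIN_SAMPLES]  # this completes the Future
        # Too short: drop it and keep waiting.


def get_a_future_with_new_data():
    return executor.submit(recv_task)


future = get_a_future_with_new_data()
incoming.put([0.0] * 50)    # ignored, too short
incoming.put([1.0] * 250)   # accepted and truncated to MIN_SAMPLES
result = future.result(timeout=5)
executor.shutdown()
```

Here the main thread waits with `future.result()`; in the server the coroutine would yield the future instead, so the IOLoop stays free while the worker thread blocks.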
I think that would work pretty well, but I haven’t done too much stuff like this in Python and the Bokeh code around this is brand new. We do need to write a document about how to use a data source to update all sessions in an app, I expect this to be a very common thing to do.
Maybe Bokeh could have some sort of way to do this where we do all the boilerplate here and you’d only have to write the recv_task part and the update_one_session part. But anyway, 0.11 has no such miracle in it.
A cool API could be a thing like:
# application writes these
def fetch_one_update():
    return np.frombuffer(zmq_socket.recv(), dtype=np.complex64)

def update_one_session(doc, update):
    # "update" is whatever fetch_one_update returns,
    # and this function is called with the session lock held
    # in the main IOLoop thread
    doc.get_model_by_name("my_source").data['x'] = update

# Bokeh API looks like
server_context.add_session_updater(fetch_one_update, update_one_session)
Not sure, really just thought of this just now so I’m sure it could be better. API could have separate versions for blocking IO (makes a thread) and nonblocking IO (fetch_one_update callback returns a Future, no thread). Anyway the idea is that add_session_updater does the stuff I just wrote out above for you, except for the app-specific bits.
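To make the shape of that idea concrete, here is a rough sketch of what the blocking-IO version could do internally. It is written against the standard library’s asyncio rather than Tornado or Bokeh, and everything in it (`session_updater`, `FakeSession`, the `rounds` parameter) is hypothetical. Nothing like `add_session_updater` exists in 0.11.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for a session; not Bokeh's SessionContext."""

    def __init__(self, destroyed=False):
        self.destroyed = destroyed
        self.lock = asyncio.Lock()
        self.doc = {"x": None}

    async def with_locked_document(self, func):
        # Hold the per-session lock while the document is modified.
        async with self.lock:
            func(self.doc)


async def session_updater(sessions, fetch_one_update, update_one_session, rounds):
    """Fetch blocking updates on a worker thread and fan them out to sessions."""
    loop = asyncio.get_running_loop()
    for _ in range(rounds):
        # Run the blocking fetch off the loop thread.
        update = await loop.run_in_executor(None, fetch_one_update)
        for session in sessions:
            # Skip sessions that have expired.
            if not session.destroyed:
                await session.with_locked_document(
                    lambda doc: update_one_session(doc, update)
                )


async def main():
    sessions = [FakeSession(), FakeSession(destroyed=True)]
    updates = iter([10, 20])
    await session_updater(
        sessions,
        fetch_one_update=lambda: next(updates),
        update_one_session=lambda doc, update: doc.update(x=update),
        rounds=2,
    )
    return [s.doc["x"] for s in sessions]


final_values = asyncio.run(main())
```

The application only supplies the two callbacks; the fetch loop, the thread hand-off and the per-session locking live in the helper. A real version would loop forever rather than for a fixed number of rounds.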
Havoc