Multiple datasource patch/stream

Hi everyone,

I have been playing around with bokeh for the last few weeks, and I managed to do some nice things.

However, I still cannot get multiple data sources to patch/stream to the web page/document, whether in a table or in a figure. This should be doable, right?

I was wondering whether someone has working code for this use case that I could use as an example?

Thanks for any help…

Have you tried the example in the docs? This worked really well when I was trying to figure out this library :).

Hi @asmodehn it would also really help focus the discussion if you share actual code for what you have attempted.

Thanks for the quick reply.

Yes I’ve seen the documentation, I guess I was looking for a more ‘involved’ code example…
By the way, I am running the server from async code, for which I couldn’t find documentation (it seems to fit right into the “Running a Bokeh server” page of the Bokeh 2.4.2 documentation).

The code I am using for testing things is over there :

After a bit more testing, it seems that calling document.add_periodic_callback() once, with a “composed” callback to add callbacks, works as expected. But when calling document.add_next_tick_callback() multiple times, only one is actually taken into account…

I’m not sure if I am holding this the wrong way, or if there is a problem hidden somewhere…
Thanks for the help.

I’m sorry I don’t have time to dig in to the linked code just now but I wanted to drop in a link to this example that patches three separate sources:

https://github.com/bokeh/bokeh/blob/master/examples/howto/patch_app.py

It’s a “standard” Bokeh app, which sounds like it is different from your usage, but perhaps it is useful.

I don’t really know anything about this:

But calling multiple times document.add_next_tick_callback() , only one is actually taken into account…

The very most helpful thing here would be a tiny minimal reproducer so that there is no ambiguity about the behavior being observed.

I attempted to write a minimal example here to reflect my use case:

I’m focusing here only on the usage of stream, attempting to drive the web page update…

Doing this, I encounter yet another strange behavior: it seems that, somehow, the DataTable on the web page actually “grabs” all the stream updates, so I am still very confused…
I couldn’t find any issue with the Python code so far; the callbacks are called as expected.

Anyway I hope this will help as a clean codebase.

@asmodehn Python scope / lambda capture does not work the way you seem to be expecting:

In [5]: funcs = []

In [6]: for x in ("foo", "bar", "baz"):
   ...:     funcs.append(lambda: print(x))
   ...:

In [7]: for f in funcs: f()
baz
baz
baz

That would explain why only one CDS gets updated.

Is there a reason not to put the CDS loop inside a single lambda, instead of at the top level? If so, your best bet is probably to use functools.partial to bake in an argument value to the callback.
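For illustration, here is a minimal sketch of the functools.partial approach. The `update` function is hypothetical, and plain lists stand in for the CDS objects so the snippet runs without Bokeh:

```python
from functools import partial

# Plain lists stand in for ColumnDataSource objects (an assumption made
# to keep this sketch free of Bokeh imports).
sources = {"foo": [], "bar": [], "baz": []}


def update(name, value):
    """Hypothetical callback: append a value to the named source."""
    sources[name].append(value)


# partial() binds `name` when the callback is created, not when it runs,
# so each callback keeps its own source name.
callbacks = [partial(update, name, 1) for name in sources]

for cb in callbacks:
    cb()

print(sources)  # {'foo': [1], 'bar': [1], 'baz': [1]}
```

Each source receives exactly one update, because every partial object carries its own bound argument instead of looking up the loop variable later.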

Ah, thanks a lot.
Something I really didn’t expect and would’ve spent weeks looking for.
https://docs.python.org/3/faq/programming.html#why-do-lambdas-defined-in-a-loop-with-different-values-all-return-the-same-result

Indeed now the different behaviors I saw do make more sense…
I could fix it with the approach mentioned in the FAQ, using a default argument to save the value.
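For anyone landing here later, the default-argument fix from the FAQ looks roughly like this (same loop as in the IPython session above, but returning the value instead of printing it):

```python
funcs = []
for x in ("foo", "bar", "baz"):
    # `x=x` evaluates x right now and stores it as the lambda's default
    # argument, so each lambda keeps the value from its own iteration.
    funcs.append(lambda x=x: x)

results = [f() for f in funcs]
print(results)  # ['foo', 'bar', 'baz']
```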

I don’t think I can put the CDS loop inside a single lambda, as each of these must be called for potentially different documents (created for potentially different web requests), and I don’t know when their next tick is going to happen…
From the API at least, it seems to me that:

  • A CDS belongs to one document
  • Each document has a (potentially different) tick.
    So I must schedule each CDS update into potentially different ticks…
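To make the idea concrete, here is a rough sketch of scheduling one update per CDS on the Document it belongs to. The helpers `make_document` and `schedule_update` are hypothetical names for illustration, and the two Document objects merely stand in for two browser sessions. Without a running server the callbacks are only registered, not executed:

```python
from functools import partial

from bokeh.document import Document
from bokeh.models import ColumnDataSource


def make_document():
    """Hypothetical helper: build one Document with its own two sources."""
    doc = Document()
    sources = [ColumnDataSource(data=dict(x=[], y=[])) for _ in range(2)]
    for source in sources:
        doc.add_root(source)
    return doc, sources


def schedule_update(doc, source, new_data):
    # partial() binds `source` and `new_data` now, which avoids the
    # late-binding problem of a lambda created in a loop. The Document
    # runs the callback on its own next tick.
    return doc.add_next_tick_callback(partial(source.stream, new_data))


# Two Documents stand in for two browser sessions (an assumption).
registry = [make_document() for _ in range(2)]

for doc, sources in registry:
    for i, source in enumerate(sources):
        schedule_update(doc, source, dict(x=[i], y=[i * 10]))
```

Each Document ends up with one pending next-tick callback per source, and each callback targets only the source it was created for.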

This is true—in fact every Bokeh model (e.g. Plot, Range1d, whatever) can only belong to exactly one Document. But the converse is not true, of course. A Document can have as many CDS objects as it needs.

Each document has a (potentially different) tick.

I don’t really understand what this is suggesting. In the code you provided, for any given user session, all the CDS there are all always in one single Document.

First part : the Document

Each document has a (potentially different) tick.

I don’t really understand what this is suggesting. In the code you provided, for any given user session, all the CDS there are all always in one single Document.

I guessed that each web browser connecting to the server potentially creates a document → multiple connections create multiple documents (i.e. instances of Document).

I understand this is what you mean, more precisely, by “for any given user session”.

From the doc:

Sessions have a 1-1 relationship with instances of bokeh.document.Document : each session has a document instance. When a browser connects to the server, it gets a new session

So I potentially have multiple documents to deal with when streaming to update the plots. Even in the code I provided, multiple web browser connections will create multiple documents, right?

Note that my Clock class tries to abstract away server internals as much as possible; the goal is just to encapsulate a dataframe and provide (debug-style) visualization of that data when someone connects to the server.

Second part : the tick

The document has an add_next_tick_callback() method, so I guessed that each document potentially has a different tick, which is why the method lives on the document (instead of somewhere else, like on the server?).

I am using only one server process in this example (my immediate use case), but I am looking for a “generic” design that fits Bokeh’s design, even if I am not sure of the best way to run multiple processes yet…

But given that the Bokeh server is single-threaded, maybe there is a simpler/better way to get the different ticks in my case? Maybe scheduling callbacks on the tick of the server, via https://docs.bokeh.org/en/latest/docs/dev_guide/server.html#lifecycle for example?