How to handle dynamically changing data (similar to the MultiLineGlyph)

jundl77 · January 22, 2020, 7:44pm

TL;DR: How do you best deal with real-time dynamic data where you would like to add and remove lines/glyphs from the plot while still remaining efficient (i.e. using the stream function)?

MultiLineGlyph? Adding/Removing glyphs if that is possible? Other options?

Problem:

For a real-time web-app, I have a line per user on a graph, and as users come and go I would like to add and remove the line.

I’ve used the multiline object for this before, but it does not support streaming AFAIK, and if possible I would like a more efficient implementation that does not send the entire data over the wire on every update (as happens when setting source.data directly, if I understand correctly).

What options exist in this scenario? Is it the multi-line object without streaming, or are there entirely different options?

I’ve been trying to dynamically add and remove step glyphs from my figure as I mentioned, and while adding seems to somewhat work, removing does not work at all. Thank you for your help!

More use cases

Another use-case where this could be very useful is to re-use a graph instead of creating a new one (e.g. have a drop-down menu that you choose a data type to display). This would be a solution for another problem, where too many graphs seem to massively slow down the rendering of the document.

Updates:

[1] In my case, I am talking about dynamically adding/removing ~20 glyphs.

Bryan · January 22, 2020, 7:47pm

Some preliminary questions: Is there any total maximum number of users that can be regarded as absolute? If so, is it ~10 or ~100 (or more)?

jundl77 · January 22, 2020, 7:48pm

In my case, we are talking about something in the range of 20, I would say.

Bryan · January 22, 2020, 7:53pm

Further question: is the line data for each user potentially a different length? Or will each user's line always have an identical number of points? I assume the former, but it is worth validating.

jundl77 · January 22, 2020, 7:58pm

In my case I do make sure that every line is the same length. They all share the same x axis, and for the y values I add np.nan to any line that does not have enough points.
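A minimal sketch of that padding step, in plain Python so it runs without NumPy. The helper name pad_lines and the sample data are made up for illustration, and padding at the end of each line is an assumption (the post does not say which end is padded).

```python
import math


def pad_lines(lines, length):
    """Pad each list of y-values with NaN at the end so all have `length` entries."""
    padded = {}
    for name, ys in lines.items():
        if len(ys) > length:
            raise ValueError(f"{name} has {len(ys)} points, more than {length}")
        # NaN points are simply not drawn, so the line appears shorter.
        padded[name] = list(ys) + [math.nan] * (length - len(ys))
    return padded


x = [0, 1, 2, 3]  # shared x axis
lines = pad_lines({"alice": [1.0, 2.0, 3.0, 4.0], "bob": [5.0, 6.0]}, len(x))
```

After this, every entry in lines has exactly len(x) values and can share one x column.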

Bryan · January 22, 2020, 9:34pm

OK, in that case I would really suggest something like this:

One CDS with the common "x" column and 20 columns like "User1", "User2", etc.
20 line glyphs up front, each pointing to "x" and one of the "UserN" columns

You can start off making all the glyphs visible=False. Then as a new user comes in, associate them with one of the “UserN” columns and update that data column with their data, e.g.

source.data['User2'] = new_data_for_some_user

and also make the glyph visible=True. Conversely, when a user leaves, set the associated glyph back to visible=False. You’ll need to do a little bookkeeping to keep track of which of the “UserN” columns are currently in use, and which are “free”.
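The bookkeeping could be sketched roughly as below. This is only an illustration in plain Python: the class name ColumnSlots and its methods are hypothetical and not part of Bokeh, and the user ids are made up.

```python
class ColumnSlots:
    """Tracks which pre-allocated "UserN" columns are in use and which are free."""

    def __init__(self, max_users):
        # Stored in reverse so that pop() hands out "User1" first.
        self._free = [f"User{i}" for i in range(max_users, 0, -1)]
        self._in_use = {}  # user id -> column name

    def acquire(self, user_id):
        """Return the column for user_id, assigning a free one if needed."""
        if user_id in self._in_use:
            return self._in_use[user_id]
        if not self._free:
            raise RuntimeError("all pre-allocated columns are in use")
        column = self._free.pop()
        self._in_use[user_id] = column
        return column

    def release(self, user_id):
        """Mark the column held by user_id as free again and return its name."""
        column = self._in_use.pop(user_id)
        self._free.append(column)
        return column


slots = ColumnSlots(max_users=20)
col_a = slots.acquire("alice")
col_b = slots.acquire("bob")
freed = slots.release("alice")
col_c = slots.acquire("carol")  # reuses the column alice gave up
```

When a user arrives, the column returned by acquire is the one to fill via source.data[column] and whose glyph gets visible=True; when a user leaves, the column returned by release is the one whose glyph goes back to visible=False.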

Alternatively you could try with multi_line by making a CDS that starts off with (and always has) 20 sub-arrays. You can set the alpha for each sub-line, setting to 0 for “unused” lines. For updating, you can use source.patch to update the individual sublists. (I think this works with multi_line data)

Either way would avoid the hassle of adding/deleting glyphs, which is a very heavyweight operation.

jundl77 · January 27, 2020, 8:53pm

It’s been a few days, and I wanted to give a brief update. What I do want is a truly dynamically changing graph, as well as a streaming based solution. I have something working, however it is a bit shaky. Here is what I have done.

I have used a MultiLineGlyph. Let me divide the solution into two cases:

Case 1: Number of lines has changed - send snapshot
Whenever there is a new user coming or leaving, I send the entire data over the wire again, as such:

self.source.data = new_data

Thus there is no streaming in this case. However, this does not happen too often (say on a minute basis, and probably more like an hour basis), so re-sending the data is fine in my case. As you will see if you keep reading, I have to do it anyway with my solution, for another reason.

Case 2: Number of lines has not changed - stream data
Now the problem is, how do you stream data? At the current point in time (bokeh 1.4), the MultiLineGlyph does not support the stream operation, at least not in a sensible way.

What is supported, however, is the patch operation, like @Bryan mentioned. Unfortunately, it turns out that the patch operation can only change the location of existing points on a graph; it cannot append new points to the CDS.

This can be solved with a little bit of a dirty trick. At the start, I stream the data to the CDS by sending an empty window of np.nan points for the following 2 minutes. In my case, I can even be a bit clever and pre-compute my x-axis values, since I know at which rate my graph is updating (say 10fps → 1 point every 100 ms).

Now, for the next two minutes I only have to send the new points, always moving one of the hidden np.nan points via the patch operation to my desired y-axis value.

After the two minutes are over, I resend my data, again appending a new two-minute window of empty points. And thus you can go on and on, always streaming only the updates to your multi-line graph.
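Roughly, the pre-allocated window and the per-update patch could be built like this. It is a sketch under assumptions, not the exact code I run: the helper names make_window and make_patch are made up, and the ([line, point], value) tuple form is my reading of the sub-array format in the ColumnDataSource.patch documentation, so check it against your Bokeh version. Depending on the version, the sub-arrays may also need to be NumPy arrays rather than plain lists.

```python
import math


def make_window(start_x, step, n_points, n_lines):
    """Pre-compute x-values for one window and fill every line with NaN."""
    xs = [start_x + i * step for i in range(n_points)]
    return {
        "xs": [list(xs) for _ in range(n_lines)],
        "ys": [[math.nan] * n_points for _ in range(n_lines)],
    }


def make_patch(point_index, new_ys):
    """Build a patch dict that sets point `point_index` on every line."""
    return {"ys": [([line, point_index], y) for line, y in enumerate(new_ys)]}


# 10 updates per second for 2 minutes -> 1200 placeholder points per line.
window = make_window(start_x=0.0, step=0.1, n_points=1200, n_lines=3)

# First update after the snapshot: fill in point 0 of each of the 3 lines.
patch = make_patch(point_index=0, new_ys=[1.0, 2.0, 3.0])
```

The window dict is what gets assigned to source.data as the snapshot, and each patch dict is what gets passed to source.patch until the window is used up.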

Downsides
I don’t have any hard evidence, and maybe @Bryan you can give me some feedback here, but I have a feeling that patching a graph is a relatively expensive operation on the browser side.

Doing this seems to scale quite well. However, the graph becomes clunky and stutters when I try to move it around or zoom in and out.

Anyway, this is my approach to solving this problem, at least for now.

I hope I was able to help someone out there.