Update figure/ColumnDataSource with streamed data from Java

Martin_Guthrie · September 11, 2020, 8:27pm

I have a large data set ColumnDataSource and I would like to append data to it and have the figure replot just the new data, not the entire data set.

I am doing a lot of things in Java, so I don’t have a minimal example to provide. I hope I have a few simple questions that with answers can point me in the right direction…

I think I have two approaches:

1) I believe I want to engage the stream action of ColumnDataSource. The documentation and other posts indicate that after calling ColumnDataSource.stream with new data, one needs to refresh using push_document, but since I rendered the page using components, I don’t know where to get the handle to call push_document on the Python side. In Java I do have a handle to the Bokeh class. Is there a push_document function I can call from the Java side?
2) Currently I update the data in the ColumnDataSource on the Java side, and afterwards I call ColumnDataSource.change.emit(), which causes a replot of all the data. This has worked great for me up to now. But once the data gets large, replotting all of it hurts the user experience. Is there a way I can trigger a replot of just the new data that was added?

2A) I see that the ColumnDataSource has a streaming object. Is there some way to use that to stream data into the ColumnDataSource? For example, adding data like ColumnDataSource.streaming.data(new stuff), and then call ColumnDataSource.streaming.emit()…

p-himik · September 11, 2020, 9:05pm

I’m confused. Are you sure you’re using Java and not JavaScript? If so, how exactly do you call ColumnDataSource.change.emit() in Java?

Martin_Guthrie · September 11, 2020, 9:15pm

Sorry, yes, Javascript.

To be more clear, here are some snippets that I use to do things,

This is how I get the element, where name is the name of the Bokeh widget that is assigned to it,

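            for (const id in Bokeh.index) {
                const view = Bokeh.index[id];  // one entry per embedded Bokeh root
                const el = view.model.document.get_model_by_name(name);  // look up the model by its name
                console.log(view, el);
                // ... use el here (rest of the loop body is not shown in this snippet)
            }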

So, where I said ColumnDataSource.change.emit() above, it’s really el.change.emit(), where el is the object found by name.

When I add data to the ColumnDataSource, it looks like this, where param is a dict of data and flags,

            if (param.hasOwnProperty("stream") && param.stream != null) {
                if (param.hasOwnProperty("clear") && param.clear) {
                    // replace data case
                    for (const col_data in el.data) {
                        el.data[col_data] = param.stream[col_data].slice(0);  // replace with a copy of the new data
                    }
                } else {
                    // append data case
                    for (const col_data in el.data) {
                        // for...of iterates the values; for...in would push the index keys ("0", "1", ...)
                        for (const item of param.stream[col_data]) {
                            el.data[col_data].push(item);
                        }
                    }
                }
                el.change.emit();  // this does a redraw of all the data, whether replace or append
            }

Bryan · September 11, 2020, 9:17pm

You’ll have to clarify this question. If you mean: “is there any way to do partial canvas draws?” then the answer is no. Every redraw request paints the entire canvas in full, i.e. every glyph is requested to re-render all of itself. There is no machinery for incremental drawing or damage regions or anything like that. You might take a look at the WebGL support if actual rendering is a bottleneck for you (though YMMV, WebGL support only covers a subset of glyphs).

Bryan · September 11, 2020, 9:22pm

I think you mean push_notebook, and that is only needed:

In Jupyter notebooks
when you want to update data from Python
and are not embedding a real Bokeh server app

i.e. push_notebook is an explicit mechanism to sync Python → JS.

If you are in JS, updating from JS, you can just call stream directly; there is nothing else to do.
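
As a rough sketch of what calling stream from JS could look like: the helper below is hypothetical (the name appendRows and its checks are not part of Bokeh), and it assumes `source` is the ColumnDataSource object obtained with get_model_by_name as in the snippet earlier in this thread. It only checks that every existing column gets the same number of new rows and then hands the data to the source’s stream method, with an optional rollover that caps how many rows are kept. Per the answer above, the plot still repaints in full after the call.

```javascript
// Hypothetical helper (not a Bokeh API): append rows to a BokehJS
// ColumnDataSource from JavaScript via its stream() method.
// `source` is assumed to come from document.get_model_by_name(name).
function appendRows(source, newData, rollover) {
  const columns = Object.keys(source.data);

  // Every existing column should receive the same number of new rows,
  // otherwise the columns of the data source drift out of sync.
  const lengths = columns.map((col) => {
    if (newData[col] == null) {
      throw new Error(`missing column in new data: ${col}`);
    }
    return newData[col].length;
  });
  if (new Set(lengths).size > 1) {
    throw new Error("all columns must receive the same number of new rows");
  }

  // Pass only the columns the source already knows about.
  const patch = {};
  for (const col of columns) {
    patch[col] = newData[col];
  }

  // rollover (optional) limits the source to the most recent N rows.
  source.stream(patch, rollover);
  return lengths.length > 0 ? lengths[0] : 0;
}
```

Usage would be something like appendRows(el, {x: [10, 11], y: [0.5, 0.7]}, 5000), where the column names x and y and the rollover value are placeholders for whatever the real source holds.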

Martin_Guthrie · September 11, 2020, 9:22pm

Thanks.

I have tried WebGL, but that caused some other issues, specifically with tabs, which I reported. I also understand WebGL is a work in progress.

Okay. It just occurs to me that I could have tried an AjaxColumnDataSource, which has the append/replace field, to see whether append causes a redraw of all the data. Your answers above tell me that it does do a complete redraw.

Martin_Guthrie · September 11, 2020, 9:28pm

FYI, I have been using Bokeh for a long time, I think I go back to version 0.11… So some of my methods are carried over from those days…

If you want to see what I built, check out p1125 pre demo 2 - YouTube, skip to the 3min mark.

I also have another question, if I want to put a bounty on an issue, is there a proper way to go about that?

Bryan · September 11, 2020, 9:31pm

@Martin_Guthrie that looks really cool! FYI there’s no bounty system on the Discourse; I think you’d have to use Stack Overflow to try something like that.

Martin_Guthrie · September 11, 2020, 10:53pm

A possible (brute force) work-around is splitting up the data among many ColumnDataSources; for example, I tried 100 of them. When data comes in, put it into the correctly indexed ColumnDataSource and plot/emit that one. With 1000 ColumnDataSources the initial loading of the page takes a long time and there is a refresh, … but it still “worked”.
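
Roughly, the routing part looks like the sketch below. This is a simplified stand-in, not my actual code: the function name appendChunked, the fixed chunkSize, and the row objects are made up for illustration, and `sources` is assumed to be an array of ColumnDataSource objects that were already looked up by name. Given the earlier answer that every redraw paints the whole canvas, I can’t say how much this really saves.

```javascript
// Hypothetical sketch: spread appended rows over fixed-size chunks,
// one data source per chunk, and emit a change only on the sources touched.
// sources:   array of source-like objects { data: {col: []}, change: { emit() } }
// chunkSize: number of rows each source holds
// rowCount:  total rows stored so far across all sources
// newRows:   array of row objects, e.g. [{ x: 1, y: 2 }, ...]
function appendChunked(sources, chunkSize, rowCount, newRows) {
  const touched = new Set();
  let count = rowCount;

  for (const row of newRows) {
    // The global row number decides which source receives the row.
    const idx = Math.floor(count / chunkSize);
    if (idx >= sources.length) {
      throw new Error("not enough data sources for this many rows");
    }
    const src = sources[idx];
    for (const col of Object.keys(src.data)) {
      src.data[col].push(row[col]);
    }
    touched.add(idx);
    count += 1;
  }

  // Notify only the sources that actually changed.
  for (const idx of touched) {
    sources[idx].change.emit();
  }
  return count;  // new total row count
}
```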