I have a ColumnDataSource holding a large data set, and I would like to append data to it and have the figure replot just the new data, not the entire data set.
I am doing a lot of things in JavaScript, so I don’t have a minimal example to provide. I hope a few simple questions, once answered, can point me in the right direction…
I think I have two approaches:
1) I believe I want to use the stream action of ColumnDataSource. The documentation and other posts indicate that after calling ColumnDataSource.stream with new data one needs to refresh using push_document, but since I rendered the page using components I don’t know where to get the handle to call push_document on the Python side. In JavaScript I do have a handle to the Bokeh class. Is there a push_document function I can call from the JavaScript side?
2) Currently I update the data in the ColumnDataSource on the JavaScript side, and afterwards I call ColumnDataSource.change.emit(), which causes a replot of all the data. This has worked great for me up to now. But once the data gets large, replotting all of it hurts the user experience. Is there a way I can trigger a replot of only the new data that was just added?
2A) I see that the ColumnDataSource has a streaming object. Is there some way to use that to stream data into the ColumnDataSource? For example, adding data with something like ColumnDataSource.streaming.data(new stuff), and then calling ColumnDataSource.streaming.emit()…
To be clearer, here are some snippets that I use.
This is how I get the element, where name is the name assigned to the Bokeh widget:
for (let i in Bokeh.index) {
    // each entry in Bokeh.index is a root view; look the model up through its document
    var bokeh = Bokeh.index[i];
    var el = bokeh.model.document.get_model_by_name(name);
    console.log(bokeh, el);
}
So, where I said ColumnDataSource.change.emit() above, it’s really el.change.emit(), where el is the object found by name.
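The lookup can also be wrapped in a small helper. This is only a sketch: `findModelByName` is a hypothetical name, and the index is passed in as an argument (in the page it would be `Bokeh.index`, assumed here to be a plain object of views as in the loop above) so the function can be tried without a live document.

```javascript
// Hypothetical helper: search every view in a Bokeh.index-like object
// for a model with the given name. Returns the first match, or null.
function findModelByName(index, name) {
  for (const key of Object.keys(index)) {
    const view = index[key];
    // Same call chain as in the loop above.
    const el = view.model.document.get_model_by_name(name);
    if (el != null) {
      return el;
    }
  }
  return null;
}
```

In the page this would be called as `findModelByName(Bokeh.index, name)`.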
When I add data to the ColumnDataSource, it looks like this, where param is a dict of data and flags,
if (param.hasOwnProperty("stream") && param.stream != null) {
    if (param.hasOwnProperty("clear") && param.clear) {
        // replace data case
        for (var col_data in el.data) {
            el.data[col_data] = param.stream[col_data].slice(0); // replace with a copy of the new data
        }
    } else {
        // append data case
        for (var col_data in el.data) {
            // for...of iterates the values; for...in would push the indices instead
            for (var item of param.stream[col_data]) {
                el.data[col_data].push(item);
            }
        }
    }
    el.change.emit(); // this does a redraw of all the data, whether replace or append
}
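For anyone who wants to try the replace/append logic outside the page, here is a rough sketch of it as a standalone function. `applyStreamUpdate` is a hypothetical name, the `data` argument is a plain object of column arrays standing in for `el.data`, and it assumes every column in `data` is also present in `param.stream` as a plain array.

```javascript
// Hypothetical helper: apply a replace-or-append update to a plain object of
// column arrays (standing in for el.data). Returns true if an update was applied.
function applyStreamUpdate(data, param) {
  if (param.stream == null) {
    return false; // nothing to apply
  }
  for (const col of Object.keys(data)) {
    const incoming = param.stream[col];
    if (param.clear) {
      data[col] = incoming.slice(); // replace with a copy of the new data
    } else {
      for (const value of incoming) {
        data[col].push(value); // append the new values one by one
      }
    }
  }
  return true;
}
```

In the page, the caller would then do something like `if (applyStreamUpdate(el.data, param)) el.change.emit();`, which still requests a redraw just as before.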
You’ll have to clarify this question. If you mean: “is there any way to do partial canvas draws?” then the answer is no. Every redraw request paints the entire canvas in full, i.e. every glyph is requested to re-render all of itself. There is no machinery for incremental drawing or damage regions or anything like that. You might take a look at the WebGL support if actual rendering is a bottleneck for you (though YMMV, WebGL support only covers a subset of glyphs).
I have tried WebGL, but that caused some other issues, specifically with tabs, which I reported. I also understand WebGL is a work in progress.
Okay. It just occurred to me that I could have tried an AjaxColumnDataSource, which has the append/replace field, to see whether it causes a redraw of all the data when append is used. Your answers above tell me that it does do a complete redraw.
@Martin_Guthrie that looks really cool! FYI there’s no bounty system on the Discourse; I think you’d have to use Stack Overflow to specifically try something like that.
A possible (brute force) work-around is splitting up the data among many ColumnDataSources; for example, I tried 100 of them. When data comes in, put it into the correctly indexed ColumnDataSource and plot/emit that one. With 1000 ColumnDataSources the initial loading of the page takes a long time and there is a refresh, … but it still “worked”.
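Roughly, the indexing part of this work-around could look like the sketch below. All names here (`chunkIndexFor`, `appendRow`, `chunkSize`) are made up for illustration, and each chunk is a plain object of column arrays standing in for one pre-created ColumnDataSource; it is not the exact code I ran.

```javascript
// Hypothetical: which chunk a given row number belongs to.
function chunkIndexFor(rowIndex, chunkSize) {
  return Math.floor(rowIndex / chunkSize);
}

// Hypothetical: append one row to the chunk it belongs to and return that
// chunk's index, so the caller knows which source to emit on.
function appendRow(chunks, chunkSize, rowIndex, row) {
  const i = chunkIndexFor(rowIndex, chunkSize);
  if (i >= chunks.length) {
    throw new RangeError("row " + rowIndex + " does not fit in " + chunks.length + " chunks");
  }
  for (const col of Object.keys(row)) {
    chunks[i][col].push(row[col]); // assumes every chunk already has this column
  }
  return i;
}
```

With real sources, the returned index would pick the one ColumnDataSource to call `change.emit()` on after the append.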