How to incrementally build a server page with multiple next tick callbacks?

hrbigelow · August 22, 2023, 10:17pm

Hi,

I have a Bokeh Server application that, on start up, creates a Figure and ultimately adds 10 Lines to it. Each line has about 300,000 points. The total set of work I need to do then is:

from functools import partial

from bokeh.io import curdoc
from bokeh.models import ColumnDataSource

def build_figure(data, fig):
    # data = [ line1_data, line2_data, ... ]
    for i in range(10):
        cds = ColumnDataSource(data[i])
        fig.line('x', 'y', source=cds, name=f'line {i}')

curdoc().add_next_tick_callback(partial(build_figure, data, fig))

However, this takes several seconds, during which the screen is blank and there is no visual indication of progress.

I would instead like to break this work into individual callbacks, one for each Line:

def build_line(line_data, name, fig):
    # line_data = { 'x': ..., 'y': ... }
    cds = ColumnDataSource(line_data)
    fig.line('x', 'y', source=cds, name=name)

for i in range(10):
    curdoc().add_next_tick_callback(
        partial(build_line, data[i], f'line {i}', fig)
    )

However, this doesn’t seem to solve the problem. It appears (not surprisingly) that all of the 10 callbacks are executed on the next tick.

I would like to somehow schedule these callbacks serially, so that only one runs at a time. What is the best practice for achieving something like this?

Thanks again,

Henry

Bryan · August 22, 2023, 10:24pm

I am skeptical that 3M data points is ever going to yield reasonable results, and would advise you to look at other solutions, e.g. HoloViz+Datashader. This can generate Bokeh output, but uses Datashader to render millions/billions of points server side, so that only a final rendered image needs to be sent to the client (much less data on the wire).

However, if you do actually want to try to continue down this path, the way you will have to do it is to make the first next-tick callback responsible for calling add_next_tick_callback for the second callback, and the second responsible for the third, etc. all the way down the line. As you have discovered, it will not work to add them all “up front” (that’s just how the underlying Tornado event loop APIs function).
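The chaining described above might look roughly like the following. This is only a minimal sketch, not an official Bokeh recipe: `schedule_serially` and `tasks` are hypothetical names introduced here, and `doc` is assumed to be anything that exposes `add_next_tick_callback` the way a Bokeh document does.

```python
from typing import Callable, Sequence


def schedule_serially(doc, tasks: Sequence[Callable[[], None]], index: int = 0) -> None:
    """Register tasks[index] as a next-tick callback that, once it has
    run, registers tasks[index + 1], and so on down the list.

    Hypothetical helper (not part of Bokeh). `doc` is assumed to provide
    add_next_tick_callback, e.g. the object returned by curdoc().
    """
    if index >= len(tasks):
        return  # every task has already been scheduled and run

    def step() -> None:
        tasks[index]()                            # do one unit of work
        schedule_serially(doc, tasks, index + 1)  # only now queue the next one

    doc.add_next_tick_callback(step)
```

With the `build_line` function from the original question, the ten lines could then be queued with `tasks = [partial(build_line, data[i], f'line {i}', fig) for i in range(10)]` followed by `schedule_serially(curdoc(), tasks)`. Only one callback is registered at any time, so each line is built in its own callback rather than all ten being queued up front.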

hrbigelow · August 22, 2023, 11:35pm

Wow - never knew about HoloViz / Datashader. That seems like the way to go, because 3M points is actually going to be on the low end. With the extra speed I may not need the chained next-tick callbacks. I’ll look into it.

Thanks again,

Henry

hrbigelow · August 23, 2023, 9:17pm

Hi Bryan,

Actually, would you have any suggestion for what data sources to use for hvplot? It says it supports Pandas, Dask, XArray, Rapids cuDF, Streamz, Intake, Geopandas, NetworkX and Ibis.

My particular requirement at the moment is the following:

I have a dataset that periodically (every 10 seconds or so) receives more data points, all representing points in one or more lines of a line plot. The user should then see the existing lines grow periodically, and sometimes, new lines appear.

At the moment, I’m assuming that the way to do this is to maintain some data structure, such as a pandas DataFrame, which periodically receives the updated data, either by extending an existing column or by adding a new column, and then to call df.hvplot(…).

However, I read that a DataFrame is not very efficient for appending data. So, I wondered, first, is it necessary to maintain the whole aggregate structure, or does hvplot have some internal data structure that it maintains and could be updated directly?

Or, if I must maintain that structure, what is the best choice? I looked at some of the others but it’s not quite clear. xArray seems to assume a rectangular structure for the data. Streamz seems quite heavyweight, but maybe it is the right choice? Dask seems to be higher level than pandas.

Any advice appreciated!

Bryan · August 24, 2023, 4:39pm

Actually, would you have any suggestion for what data sources to use for hvplot?

Although we collaborate at a high level with the HoloViz folks, it is a separate project maintained by different people. I am not a developer or user of hvplot ^[1] so I don’t really have any experience to share. You might find the HoloViz Discourse to be a more focused venue for questions about hvplot and other HoloViz tools.

cc @Philipp_Rudiger @James_A_Bednar1

[1] In fact I don’t “do” datavis or data science at all, really. I just like working on the tools.

hrbigelow · August 24, 2023, 8:43pm

Thanks Bryan - ahh, yes on reflection I probably should have read more before asking this, thanks for the links.
