I have a Bokeh server app which does the following:
- Loads data from a database (~8,000 rows × 100 columns) into a column data source.
- Creates a linked figure and table (the figure shows 4 columns of the data, the table shows 8). The table uses a second column data source, which is initialized with the required columns and 1 row of data per column.
- Adds a periodic_callback every 1000 ms which does two things:
  - manually sets the y-axis range so that it fits the data being shown;
  - if a row in the table is clicked, centers the figure on the corresponding data point.
- There is also a “Select” widget which loads different rows into the table, corresponding to preset filters on the data.
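To make the y-axis step concrete, here is a minimal sketch of the range calculation only. It uses plain lists so it runs without Bokeh; the function name `fit_y_range` and the `pad_fraction` parameter are made up for illustration and are not taken from the app.

```python
def fit_y_range(xs, ys, x_start, x_end, pad_fraction=0.05):
    """Return (low, high) covering the y values whose x lies in [x_start, x_end]."""
    visible = [y for x, y in zip(xs, ys) if x_start <= x <= x_end]
    if not visible:
        return None  # nothing in view, so the caller can leave the range unchanged
    low, high = min(visible), max(visible)
    pad = (high - low) * pad_fraction
    if high == low:
        pad = 1.0  # avoid a zero-height range when all visible values are equal
    return low - pad, high + pad


# Three sample points; only the first two fall inside the visible x window.
y_limits = fit_y_range([0, 1, 2], [0.3, 0.5, 0.6], x_start=0, x_end=1, pad_fraction=0.0)
```

In the app itself, a function registered with `curdoc().add_periodic_callback(update, 1000)` would presumably read the figure's current x-range and assign the result to the y-range's `start` and `end` (here `update` is a hypothetical name).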
For example, suppose the data loaded from the DB and saved in the column data source looks like this:
index | value | filter1
0 | 0.3 | 1
1 | 0.5 | 0
2 | 0.6 | 1
The select widget would have the preset filters “filter1=0” and “filter1=1”, each of which renders the corresponding rows in the table. It works by converting the main CDS to a dataframe, filtering the df on a predetermined criterion, and then updating the table CDS to contain the new filtered data.
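A minimal sketch of that filter step, using the example rows above. To keep it runnable without pandas or Bokeh, it filters a plain dict of lists (the same column-wise layout a column data source holds) instead of doing the dataframe round trip; `filter_rows`, `main_data` and `table_columns` are illustrative names, not the app's actual code.

```python
def filter_rows(data, column, wanted, table_columns):
    """Return a new dict of lists holding only rows where data[column] == wanted."""
    keep = [i for i, v in enumerate(data[column]) if v == wanted]
    return {name: [data[name][i] for i in keep] for name in table_columns}


# The example rows, laid out column-wise as in a column data source.
main_data = {
    "index": [0, 1, 2],
    "value": [0.3, 0.5, 0.6],
    "filter1": [1, 0, 1],
}

# Preset filter "filter1=1": keep rows 0 and 2, and only the table's columns.
table_data = filter_rows(main_data, "filter1", 1, table_columns=["index", "value"])
```

In the app, the resulting dict would be assigned to the table's data source in a single step (something like `table_source.data = table_data`, where `table_source` is a hypothetical name), so the table receives one update per filter change.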
This all works very quickly after the first filtering is done. However, the initialization can be very slow.
A rough breakdown:
- About 1 minute from launching the app to having data render on the figure. This is reasonable, and I believe the added latency is from loading the data from the database.
- The first time my “Select” widget is changed, it takes 2-3 minutes to load the initial group of data into the table. After this initial 2-3 minute load, any new updates happen very quickly, typically taking only 2-5 seconds. I have noticed that during this first change the periodic_callback stops running.
The database should only ever be queried once, on the initial load, so I do not think that is causing the issue. I assume that if the plot is already rendered then all of the data is loaded, unless there is some sort of lazy evaluation I'm unaware of behind the scenes.
It is very unpleasant to have the plot render and function normally, then have it freeze for 3 minutes while the first table update is done.
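One way to narrow down where the 2-3 minutes go would be to time each stage of the Select callback separately. The sketch below is only an illustration of that idea: the `timed` helper and the stage labels are made up, and the three stages are stand-ins for the dataframe conversion, the filter, and the table update described above.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> elapsed seconds, filled in by timed()


@contextmanager
def timed(label):
    """Record how long the wrapped block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Stand-ins for the three stages of the Select callback.
with timed("cds_to_dataframe"):
    rows = [{"index": i, "filter1": i % 2} for i in range(8000)]
with timed("filter"):
    kept = [r for r in rows if r["filter1"] == 1]
with timed("update_table_source"):
    table_data = {"index": [r["index"] for r in kept]}

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

If all three server-side stages turn out to be fast on the first call, that would suggest the delay is somewhere else, for example in sending the update to the browser or rendering the table there.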
I am hoping to find one of the following:
- An optimization I am missing, whose absence is making this initial filter slow.
- A way to automate running this initial filter on load, which would be more pleasant for the user. Instead of the current flow (wait 1 minute for launch, app appears to be working, select a filter, wait 3 minutes for the filter, use as normal), the flow would be: wait 4 minutes for launch, use as normal. This is not ideal, but I think it would be much better than the current state.
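As a rough sketch of the second option, the filter handler could be called once by hand during startup, before the user touches anything. Everything here is a stand-in: `on_select_change`, `MAIN_DATA`, `state` and `INITIAL_FILTER` are hypothetical names, and the handler only mimics the `(attr, old, new)` signature that Bokeh passes to `on_change` callbacks.

```python
# Stand-in for the table's data source and a counter to show the handler ran.
state = {"table_data": None, "calls": 0}

MAIN_DATA = {"index": [0, 1, 2], "value": [0.3, 0.5, 0.6], "filter1": [1, 0, 1]}


def on_select_change(attr, old, new):
    """Handler with the (attr, old, new) shape used by on_change callbacks."""
    wanted = int(new.split("=")[1])  # "filter1=1" -> 1
    keep = [i for i, v in enumerate(MAIN_DATA["filter1"]) if v == wanted]
    state["table_data"] = {
        "index": [MAIN_DATA["index"][i] for i in keep],
        "value": [MAIN_DATA["value"][i] for i in keep],
    }
    state["calls"] += 1


INITIAL_FILTER = "filter1=1"  # hypothetical default value of the Select widget

# Warm-up: run the handler once during startup, before any user interaction.
on_select_change("value", None, INITIAL_FILTER)
```

In the real app the same function would also be registered on the widget (for example `select.on_change("value", on_select_change)`). Whether this actually moves the 2-3 minute cost into the launch time depends on where that time is spent, which is not established here.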
Any help is appreciated.