Best practices for reaching good performance when interacting with (many) plots

I have a performance issue I would like to share, in the hope of getting some pointers on how the situation could be improved.

Ideally I would get a list of best practices related to performance when using Bokeh (do this, do not do this kind of list).

Scenario:

  • We want to display plots in three different tabs (let us call these A, B and C, marked with pink arrows in the picture).
    • Only the contents of a single tab are visible at a time (the selected tab).
  • Each of these tabs has its own CDS.
  • Each plot in a given tab uses an IndexFilter to decide what it should display.
  • To give an idea of the amount of data we are dealing with, the CDS for one tab can look like this:
    • CDS column count: 3966.
    • CDS Row count: 69.
    • Plot count: 44.
  • The Plots are arranged in two columns (Bokeh layout).
  • The plots are created lazily, that is when a tab is selected.
    • When tab A is selected then the plots of A are loaded etc.

Current situation:

  • If we just select a single tab the interaction with the plots (like clicking on a sample) is okay.
  • If we select every tab once (so all plots have been created), the interaction degrades a lot.
  • It seems to me that we pay a big price in terms of performance just by creating plots, even though they are not visible.

Concrete question:

  • Can I somehow temporarily unlink/hide/disable (or similar) the plots for the tabs that are currently not visible?

I understand you might need an example to go deeper. That being said, I was hoping there are some general best practices related to performance that some of you are willing to share or link to?


Is there a reason why the plots are created dynamically instead of all upfront? Do the interactions influence the layout? In my experience, the best performance is achieved when only the data in the CDS is updated while the plots stay as they are.
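
To illustrate what updating only the CDS data could look like, here is a small sketch. The column names and the `update` helper are made up for the example; the point is just that the figure and its renderer are created once and only `source.data` changes afterwards.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Create the source, figure and renderer once.
source = ColumnDataSource(data={"x": [0, 1, 2], "y": [1, 4, 9]})
p = figure(width=300, height=200)
renderer = p.line("x", "y", source=source)


def update(new_x, new_y):
    """Swap in new data without touching the plot (hypothetical helper)."""
    # Assign all columns in one step so their lengths stay consistent.
    source.data = {"x": list(new_x), "y": list(new_y)}


update([0, 1, 2, 3], [0, 1, 8, 27])
```

The renderer keeps pointing at the same source, so nothing in the layout has to be rebuilt.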

Can I somehow temporarily unlink/hide/disable (or similar) the plots for the tabs that are currently not visible?

Linking of layouts across tabs is the main culprit for performance degradation. Currently there’s no way to disable this behavior. I started a PR which makes this configurable.

We looked into creating plots lazily with the goal of improving performance. We could see that interactions were smoother when we did not create all the plots up-front. Ideally we would like to create everything up-front without paying anything extra for hidden tabs.

Nice - we really appreciate this! Is there any short-term workaround for this? We are currently using Bokeh 3.8.2.

Side note:

Lastly, I just want to share a finding from last week. We experimented with skipping the lazy load, creating all plots up-front and using explicit hide operations. One of our hidden plots (longitudinal from the screenshot above) contains a lot of lines. We realized that hiding the GlyphRenderer(s) for the lines during creation, and only showing them when a tab became active, had a significant impact on the initial load time. We currently also hide the GlyphRenderer(s) again once we leave a tab. Ideally we would want to avoid “hacks” like this, but maybe this information can be useful for someone.

Thanks a lot for the feedback!
