Bokeh JS performance

Hi,

I’ve been experimenting with Bokeh for quite some time now and want to take my app to production. However, when I run an audit (via Chrome’s dev tools), I get:

Minimize main-thread work 59.4 s
Consider reducing the time spent parsing, compiling and executing JS. You may find delivering smaller JS payloads helps with this.

Category Time Spent
Script Evaluation 40,933 ms
Style & Layout 14,146 ms
Other 2,226 ms

Is there any way I could optimize the JS performance? I’ve already enabled minified resources and am using the inline JavaScript setting. I’m using WebGL plotting.

Are there any more knobs I need to tweak in order to make the page more snappy?

Hi @codeyman it’s not really possible to say much without knowing some specifics of what you are actually doing. For instance, none of the examples at https://demo.bokeh.org/ spend 40 seconds or anything close to that, for any reason. Any suggestions would require knowing what exactly you are doing, that is unique to your situation, to incur those high costs, as well as the relevant versions in case you are not using the latest Bokeh version.

Hi @Bryan,

The layout is as follows:
Tab1:
  1 line plot
  1 DataTable
  1 DataTable
Tab2:
  1 mosaic chart made by compositing around 15 rectangles
  1 div of text
  1 histogram
Tab3:
  1 cluster plot made with circle
  5 line plots
  2 line plots with Band
Tab4:
  1 DataTable

The plots on the demo page are mostly single-page interactive plots with only a few elements each. The backend currently fetches all the data in parallel. JS might be taking time to render so much data; I was looking for a way to defer loading a tab until it is active, but I couldn’t find one.

I’m using Bokeh 2.0.2 with Python 3.6.9.

@Bryan, Please also note that the audit tool in chrome throttles CPU/Network. The visible time is closer to 20s (which is still a lot).

@codeyman I would not offhand expect ~20 seconds of work for that with Bokeh 2.x. [1] How much data is in these plots and tables? Is there any way you can construct an MRE (e.g. with fake or synthetic data) that can actually be run and investigated? That’s really what would be needed to try to say anything concrete.


[1] As a comparison, here is a page with 260 plots that renders in about a second: https://blog.bokeh.org/static/release-1-1-0/large-grid.html

Actually, as a quick check, can you see if there is any major difference from removing the Div?

The div is not populated unless some tile is pushed in the mosaic chart. (I’ll try removing it and report whether it makes a difference.)
I found some rough numbers… the data is coming from BigQuery with limits in place. It comes from around 5-6 tables, totalling around 3-4 MB and around 10,000 rows in all.

Hard for me to construct an MRE with the scope of data that we have :frowning:

Removing div doesn’t solve the problem.

You could generate random data of similar size.
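As a rough illustration of that suggestion, here is a minimal sketch that builds synthetic tables at roughly the scale mentioned above (about 10,000 rows in total). The function name `make_fake_table`, the column names, and the split into five tables of 2,000 rows are all assumptions made up for this example; nothing here comes from the original app.

```python
import random


def make_fake_table(n_rows, n_numeric_cols=5, seed=0):
    """Return a dict of equal-length columns filled with random data.

    Hypothetical helper: the column names and types are placeholders.
    """
    rng = random.Random(seed)  # seeded so the MRE is reproducible
    columns = {"id": list(range(n_rows))}
    for c in range(n_numeric_cols):
        columns[f"value_{c}"] = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
    # One categorical column, e.g. for a table or for colouring a cluster plot.
    columns["label"] = [rng.choice(["a", "b", "c", "d"]) for _ in range(n_rows)]
    return columns


# Assumed scale: 5 tables x 2,000 rows = 10,000 rows in all.
tables = {f"table_{i}": make_fake_table(2_000, seed=i) for i in range(5)}
total_rows = sum(len(t["id"]) for t in tables.values())
print(total_rows)
```

Each of these column dicts has the shape that `ColumnDataSource(data=...)` accepts, so they could stand in for the real query results when building a shareable example.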

I’ll try and get back to you guys.
Thanks.


@p-himik finally after a day of work, I was able to anonymize the data and obfuscate the code enough to share. How do I share this with you?

I rendered the 260-plot page using Chrome’s audit tool and it took 13 s (which is still better than my app). A few things are painfully obvious:

  1. There needs to be progressive updates. Just a quickly displayed viewport goes a long way to increase the “snappiness” perception by the end user.
  2. Lazy updates (this can happen via 1). Only pull what is required.

Unfortunately those kinds of things would require some pretty drastic changes at fundamental levels AFAIK. [1] Which doesn’t mean they can’t or shouldn’t be done, but this is an area where new contributors with the right experience and opinions could help make that happen much faster.


[1] When Bokeh was first created the scope of things was “some plots in a Jupyter notebook” or “Shiny for Python” (which generally have very simple single page “plot and a few widgets” UIs) and things in that scope are served very well. Folks have been pushing to use Bokeh for ever more sophisticated things though, which is awesome, but long story short, we could use help to expand those boundaries.


I understand. I gravitated towards Bokeh because I have 0 JS skills :slight_smile: (and I didn’t want to be dependent on the Java/UI team). If there were a corporate arm to the project (like plotly/dash), that could have paid for infra improvements. Thanks though.

Well, I don’t mean to come off as too discouraging. :slight_smile: We are continually improving and there are hopefully some opportunities this year for increased funding and support. But it will always be the case that more can get done faster with more hands on deck!

Copying over a discussion that accidentally happened in PM. cc @mateusz and @p-himik


Hi @Bryan, I genuinely tried to look at the code to fix this, but gave up after realizing that TypeScript is too different from anything I do. (I’m mainly a C (network OS) / Python (ML) guy.)

So the next best thing I could do is document my findings in the hopes that you might find it useful. Please do take a look at https://drive.google.com/open?id=1_1wRwPD1uDKqKg0FoBWd9dtdxJrnnAZ7

Bryan (Core Team Member):

@codeyman That’s a bit of an apples and oranges comparison in a potentially important regard. Those results are using CDN loading for plotly but not for Bokeh. By default Bokeh loads resources from the Bokeh server itself. If you want to use a CDN to load BokehJS (e.g. to take advantage of caching in the browser after a first load) you can set the env var:

BOKEH_RESOURCES=cdn bokeh serve ...

I’ve thought lately it might be worth discussing changing the default to use CDN, but if you can compare results with both using CDN that possibly might add evidence to support such a change.

codeyman:

In my local tests I was using Bokeh inline resources and running on localhost, so loading files was not an issue. The performance tests done locally point to JS optimization issues.

I ran the audit tool on https://blog.bokeh.org/static/release-1-1-0/large-grid.html. The results are similar (this page fetches from the CDN). We are using only 39% of the JS code that is actually downloaded. If the page included a table, even more of the downloaded code would go unused. (You can check this with the Coverage and audit tools in Chrome’s dev tools.)

Bryan (Core Team Member):

We are in fact looking into splitting BokehJS in a more fine-grained fashion, but there are definitely limits to how far this can go. As one example: there are dozens of glyph types. Most plots will only use a few. But I can’t imagine any scenario where e.g. things were split up with a distinct separate file per-glyph type. But we might be able to partition at a level of “common” and “less common” glyphs. A similar statement is true of all the interactive plot tools. At the terminal end: splitting BokehJS up into hundreds of individual tiny files for each distinct model just to optimize this utilization percentage is not a maintenance burden we can support.

I completely agree with you here. Perhaps a performance test to load up a version of a test dashboard with tabs/tables/plots with manually chopped up JS to determine the upper limits of performance gain (with async/defer)?

Or maybe a mode to dynamically generate JS files based on the glyphs included (by delaying the TypeScript compile until bokeh serve is called)? With most deployments inching towards cloud deployments like https://cloud.google.com/solutions/bokeh-and-bigquery-dashboards, a production system would probably prefer using its own CDN for its specific use case.

In any case, thanks for indulging my flights of fancy :slight_smile:. I understand the resource constraints.

@mateusz is definitely interested in supporting custom bundle compilation and I think that would be a great tool for the folks that need/want this level of control

Not sure if at all related, but Google Closure Compiler has very good dead code elimination algorithms. I think webpack has something called tree shaking but I know little about it. Of course it would mean having to pass (already compiled?) BokehJS through such a tool along with the user code, so it’s a task for users and not developers.

I think I’m able to get some respite. Instead of loading one Tab() via bokeh server, I broke it into separate layouts and loaded them via a jinja template:

{% extends base %}

{% block title %}Test tab{% endblock %}

{% block postamble %}
    <link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<script type="text/javascript">
function openvtab(evt, tabname) {
  var i, x, tablinks;
  x = document.getElementsByClassName("vani-tab");
  for (i = 0; i < x.length; i++) {
    x[i].style.display = "none";
  }
  tablinks = document.getElementsByClassName("tablink");
  for (i = 0; i < tablinks.length; i++) {
    tablinks[i].className = tablinks[i].className.replace(" w3-red", "");
  }
  document.getElementById(tabname).style.display = "block";
  evt.currentTarget.className += " w3-red";
}
</script>
{% endblock %}

{% block contents %}

      <div id="pane1" class="vani-tab">
      {{ embed(roots.pane1) }}
      </div>
      <div id="pane2" class="vani-tab" style="display:none">
      {{ embed(roots.pane2) }}
      </div>

    <div class="w3-sidebar w3-bar-block w3-light-grey w3-card" style="width:130px">
      <h5 class="w3-bar-item">Menu</h5>
      <button class="w3-bar-item w3-button tablink" onclick="openvtab(event, 'pane1')">Alarms</button>
      <button class="w3-bar-item w3-button tablink" onclick="openvtab(event, 'pane2')">Action Items</button>
    </div>

{% endblock %}

The benefit might be because the combined tab data is big and takes some time to download, while each of the four separate layouts (reduced to two in this example) is not. I’ll work on triggering bokeh.embed_items on click rather than all in one shot (caching panes that have already been embedded) and try to get more juice out of it.
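The "embed on click, cache if already done" idea is plain build-once-on-first-use caching. Below is a minimal, language-neutral sketch of that logic, written in Python for consistency with the rest of the app. In the actual page the equivalent would live in the JavaScript click handler. The class `LazyPanes` and the builder functions are hypothetical names invented for this illustration, and they are not part of Bokeh.

```python
class LazyPanes:
    """Build each pane on first request and reuse the result afterwards."""

    def __init__(self, builders):
        self._builders = dict(builders)  # pane name -> zero-argument builder
        self._cache = {}                 # pane name -> already-built pane

    def get(self, name):
        # Only pay the (expensive) build cost the first time a pane is shown.
        if name not in self._cache:
            self._cache[name] = self._builders[name]()
        return self._cache[name]


build_calls = []  # records which panes were actually built, for demonstration


def build_pane1():
    build_calls.append("pane1")
    return "<contents of pane1>"  # placeholder for the real embed work


def build_pane2():
    build_calls.append("pane2")
    return "<contents of pane2>"


panes = LazyPanes({"pane1": build_pane1, "pane2": build_pane2})
panes.get("pane1")  # first click on pane1: built now
panes.get("pane1")  # second click: served from the cache
print(build_calls)
```

With this shape, a pane that is never opened is never built, and reopening a pane costs only a dictionary lookup.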

Time to interactive has gone from 33s to 11s, and max input delay has gone from ~3s to ~200ms.
