Layout guidelines/tips for rapidly rendering many plots

Hi,

I have a grid-based layout for graphical presentation of information and user interaction to dynamically change settings and observe the effects in a multiclass classification problem. Each class’ contribution to the overall behavior is associated with one of the plots in the grid layout. Moreover, each element in the grid comprises multiple bokeh model primitives, e.g. a pair of linked figures and possibly a UI control such as a slider.

I observe experimentally that when making interactive changes to a given class’ models, the rendering is halting and slow to respond, taking several seconds for a problem with 30 - 40 classes (and thus 30 - 40 grid elements).

I am fairly certain that the problem is with re-layout of the display. The underlying data are being updated efficiently with data patches, so I’m not pushing around large amounts of data. In case it’s relevant, the application runs on a bokeh server.
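
For readers unfamiliar with data patches, here is a minimal sketch of the idea, assuming a single-point source like the operating-point marker described later in the thread. The names marker and move_marker are hypothetical and not taken from the original application; ColumnDataSource.patch is the standard Bokeh call for changing individual values in place.

```python
from bokeh.models import ColumnDataSource

# Hypothetical single-point source for one class's operating-point marker.
marker = ColumnDataSource(data=dict(x=[0.50], y=[0.80]))


def move_marker(source, new_x, new_y):
    """Patch row 0 in place instead of replacing the whole data dict."""
    # Each patch is a list of (index, new_value) pairs for that column.
    source.patch({"x": [(0, new_x)], "y": [(0, new_y)]})


move_marker(marker, 0.62, 0.85)
```

On a Bokeh server only the patched values are sent to the browser, which is why patches are cheap compared with reassigning source.data.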

I have reviewed similar prior threads, e.g.

https://github.com/bokeh/bokeh/issues/6294

And indications that related work was addressed in recent updates to bokeh, e.g.

https://github.com/bokeh/bokeh/pull/8085

With that background, I am wondering whether anyone with insight into the underlying codebase or model implementations can offer pointers for making things generally more responsive.

Specific questions that come to mind …

Are certain model types more prone to inefficiencies in this area?

Are certain sizing attributes (and at what level) preferred or able to prevent re-layout of the entire UI when only one element of a grid is being refreshed at any given time?

Thanks

Hi @_jm it’s not really possible to say anything without specifics (e.g. versions) as well as some code to actually run and experiment with. If you can construct a (complete) toy example that is representative of your usage and what you are seeing, I’m happy to try to take a look.

I am using bokeh version 2.0.1. I will try to distill my application down into a generic, basic problem.

Thanks.

Here’s a discussion that might help you debug the issue: How can I improve the perfomance updating data working with many plots and tabs?

My current guess is that 40 plots with a lot of data on them just take a long time to render, that’s it. If you’re using the webgl output backend, try using canvas instead.

Thanks for the pointer to that link. I had not seen it in my initial investigation of similar issues.

I am already using the canvas backend at create-time for my figures.

The point about large amounts of data is certainly appreciated. However, I believe I have organized the data sources and glyphs as efficiently as possible given the end-use requirements. And things are smooth and lag-free when I have a few plots in my grid.

The referenced link mentions that a source of slow-down there is that certain user actions are indeed affecting multiple plots. That’s not the case in my application, as individual actions only affect the related graphics in the associated row/column element of a grid.

So I was hoping there was something I could do in how I organize things so that bokeh re-renders only that specific row/column element rather than the entire page on each action. (I have no direct evidence from profiling or such that this is what occurs, but anecdotally that appears to be happening.)

How do you achieve this? Is there a data source per each row/column? Or is there a single data source and multiple filters/expressions/transforms/views?

Hi,

There are separate data sources for each row/column.

To make the discussion a bit more concrete, see the attached screenshot for a smaller sized problem (three classes in a multiclass classification task).

Each class has an accuracy line comprising 1000 points that give accuracy as a function of operating point (x-axis value in the range [0,1]). There is a data source for the line and it is unique to that class. It’s intentionally separated from the data sources for the other classes. The data source is created from a pandas dataframe initially, and it doesn’t get updated even when user interactions are applied. It serves as a line for visual reference.

Each class also has a glyph/marker denoting the selected operating point. That is also a separate data source, and is only a single point column data source created initially from a pandas dataframe. It is updated using a transform to ensure that the accuracy score is constrained to be on the reference line mentioned above.
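
To illustrate what constraining the marker to the reference line can look like, here is a minimal sketch in plain Python. It is an assumption about the general technique, not the poster's actual transform: snap_to_line is a hypothetical helper that picks the nearest sampled point, and the reference curve below is made up for illustration.

```python
from bisect import bisect_left


def snap_to_line(xs, ys, x_selected):
    """Return the (x, y) sample on the reference line nearest to x_selected.

    xs must be sorted in ascending order and have the same length as ys.
    """
    i = bisect_left(xs, x_selected)
    if i == 0:
        return xs[0], ys[0]      # clamp to the left end of the line
    if i == len(xs):
        return xs[-1], ys[-1]    # clamp to the right end of the line
    # Pick whichever neighbouring sample is closer to the requested x.
    if xs[i] - x_selected < x_selected - xs[i - 1]:
        return xs[i], ys[i]
    return xs[i - 1], ys[i - 1]


# Hypothetical reference line: 1000 operating points in [0, 1].
xs = [k / 999 for k in range(1000)]
ys = [1.0 - (x - 0.5) ** 2 for x in xs]

x_on_line, y_on_line = snap_to_line(xs, ys, 0.3)
```

The snapped pair can then be written into the single-point marker source, so the marker always sits on a sampled point of the line.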

So, when a user manipulates the operating point, the following things get updated: (1) the operating point markers (triangle and inverted triangle in its row/column element of the grid); (2) the hover tooltip of the inverted triangle, again only in that grid row/column element; (3) the decimal value at the top right above the plot, again only in that grid row/column element.

Additionally, the markers in the overall plot at the top of the screen are updated based on the average results from the three classes below. There is always only one overall plot regardless of the size of the problem (three classes or forty classes as in the “real world” test problem I am considering).

The problem shown here with three classes (Class A, B, C) is smooth and responsive. I can change the operating point and see the effects instantaneously for all intents and purposes.

In a 40-class problem, the observed behavior is that the operating point marker updates responsively (no discernible lag), but several seconds pass before I can see things in the associated hover tool when floating over the marker; similarly if I try to make changes to other operating points I need to wait a few seconds before I can.

I cannot say anything conclusive without the code.
I’ve come up with this simple example - and here, if you select some dots, only the affected plots are changed:

from bokeh.io import show
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

N = 5   # the grid is N x N plots
M = 10  # number of points in each data source


def mk_cds(i, rev):
    x = range(M)
    if rev:
        x = reversed(x)
    x = list(x)
    return ColumnDataSource(data=dict(x=x, y=[y + i for y in range(M)]))


r2cds = {r: mk_cds(r, False) for r in range(N)}
c2cds = {c: mk_cds(c, True) for c in range(N)}

# Build the grid: the plot at (r, c) shares one source with its row
# and one source with its column.
items = []
for r, r_cds in r2cds.items():
    row = []
    for c, c_cds in c2cds.items():
        p = figure(plot_width=250, plot_height=250, tools='box_select', title=f'R: {r}; C: {c}')
        p.circle('x', 'y', source=r_cds)               # row-shared data
        p.circle('x', 'y', source=c_cds, color='red')  # column-shared data
        row.append(p)
    items.append(row)

show(gridplot(items))

Also, if the plots in your work are usually as simple as the ones on your picture, then I cannot see how Bokeh can lag even if you have 40 of them. There’s barely something to render.

Thanks.

Upon investigation, the lag is associated with the HTML divs in the upper-right that give the score at the selected point in each class’ display. If I disable updates to those, the UI becomes completely responsive for large-ish problems with 40 classes.

I am updating the div in a callback by changing its text attribute. Here are the relevant lines of code, albeit without any surrounding context of data / model definitions. The _hdr variable here is of type bokeh.models.widgets.markups.Div, and is maintained as a member of a custom class definition in my app.

_hdr = UIx[class_num].graphics.header.div
_hdr.text = uiph.div_text.format(self.class_names[class_num], _J[class_num])

Other information that is perhaps relevant: each class’ display is a 2 x 2 grid comprising a spacer (upper left); the offending div (upper right); the main linked plots (lower left); and a small information plot under the (“i”) icon/image, which has a hover tool to pop up additional data about classes (lower right).

Thanks again.

When you change Div.text, the compute_layout is called on the whole document root. I imagine that’s the culprit with that many items on page. Try profiling the code just to confirm, if you want.

One way to fix it would be to create a custom model that renders what you want but that has a specified - fixed - width and that doesn’t trigger compute_layout when its text (or whatever name you choose) property is changed.
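
As a rough illustration of why this scales badly, here is a toy model in plain Python. It is not Bokeh's implementation: ToyRoot, ToyItem and cost_of_one_update are hypothetical names, and the only assumption is that a root-level layout pass has to visit every item on the page, while a fixed-size item can redraw itself without asking the root for anything.

```python
class ToyRoot:
    """Stand-in for a document root; counts how many items get measured."""

    def __init__(self, n_items):
        self.items = [ToyItem(self) for _ in range(n_items)]
        self.items_measured = 0

    def compute_layout(self):
        # A full pass visits every item, so the cost grows with grid size.
        self.items_measured += len(self.items)


class ToyItem:
    """Stand-in for one text element in the grid."""

    def __init__(self, root):
        self.root = root
        self.text = ""

    def set_text_with_relayout(self, text):
        # Mirrors the behaviour described above: a text change triggers
        # a layout pass on the whole root.
        self.text = text
        self.root.compute_layout()

    def set_text_isolated(self, text):
        # Fixed-size item: only its own content changes, no root layout.
        self.text = text


def cost_of_one_update(n_items, isolated):
    """Items measured when a single element's text changes once."""
    root = ToyRoot(n_items)
    item = root.items[0]
    if isolated:
        item.set_text_isolated("0.93")
    else:
        item.set_text_with_relayout("0.93")
    return root.items_measured
```

In this toy, one text change costs 3 item visits on a 3-class page and 40 on a 40-class page, versus none for the isolated variant. The real per-item cost in the browser is not modeled here, so profiling in the browser is still the way to confirm.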

This would probably help, I had wanted to get it in for 2.0 but it didn’t happen:

Very helpful. And I was testing with Chrome as well.

Thanks. I will read up on the custom model implementation and use that to address my particular case now that we know where the main bottleneck is.

Hi,

I am not a JavaScript programmer. With that background, I tried to follow the recommendation to use a custom-model referencing the extending-bokeh concepts in the bokeh 2.0.1 documentation (Extending Bokeh).

The following are the TypeScript and class definition for a barebones Div that isolates updates to the div from triggering a re-layout. I can see the new model instances being used in my application, but the problematic behavior remains.

I am certain that my attempt at a quick minimal change is the source of the problem. I was hoping that following the super-class method’s call to connect_signals() with a subsequent call to register a different model-change callback would supersede the original model-change callback.

TypeScript (iso_div.ts)

import {Div, DivView} from "models/widgets/div"

export class IsoDivView extends DivView {
    connect_signals(): void {
        super.connect_signals();
        this.connect(this.model.change, () => {
            this.render();
            // Isolate root layout from div changes
            // Ref. https://discourse.bokeh.org/t/layout-guidelines-tips-for-rapidly-rendering-many-plots/5119
            //this.root.compute_layout(); // XXX: invalidate_layout?
        });
    }
}

export class IsoDiv extends Div {
  static init_IsoDiv(): void {
    this.prototype.default_view = IsoDivView
  }
}

Python model (iso_div.py)

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Python side of the IsoDiv custom model; the view logic lives in iso_div.ts.
"""
from bokeh.models import Div

class IsoDiv(Div):
    __implementation__ = "iso_div.ts"

You cannot reuse Div in any way, or Markup (its parent class) - because MarkupView calls compute_layout() in its connect_signals(). Just get what you need from Div and Markup and create a custom model that’s based on Widget.

Thanks.

I was able to use the following TypeScript and python to extend bokeh for my specific scenario.

I basically just commented out the super.connect_signals() call in the initial attempt at a custom Div. I presume it works b/c the Widget class did not do anything in its connect_signals() method that I could see, thus enabling me to avoid a call to super.connect_signals().

With this change, my UI is smooth and instantly responsive for a 40-ish class problem that I am interactively manipulating in a grid.

TypeScript (iso_div.ts)

import {Div, DivView} from "models/widgets/div"

export class IsoDivView extends DivView {
    connect_signals(): void {
        // ***
        // Isolate root layout from div changes
        // Ref. https://discourse.bokeh.org/t/layout-guidelines-tips-for-rapidly-rendering-many-plots/5119
        //  * comment out super.connect_signals(), this might work b/c (Widget,WidgetView) has corresponding
        //    do-nothing void method ???
        //  * comment out this.root.compute_layout() to prevent re-layout of entire page if div model changes
        // ***
        //super.connect_signals()
        this.connect(this.model.change, () => {
            this.render()
            //this.root.compute_layout(); // XXX: invalidate_layout?
        })
    }
}

export class IsoDiv extends Div {
    static init_IsoDiv(): void {
        this.prototype.default_view = IsoDivView
    }
}

Python (iso_div.py)

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Python side of the IsoDiv custom model; the view logic lives in iso_div.ts.
"""
from bokeh.models import Div

class IsoDiv(Div):
    __implementation__ = "iso_div.ts"