Plot stops updating + questions related to updating logic

tvdhauwe · July 16, 2021, 3:48am

I’m developing a component to display images in a grid. All the images are displayed as one bokeh model which is rendered in a panel Bokeh pane. Panel is used for callbacks and widgets.

I had an original implementation of this which rendered every image using holoviews in its own panel pane, but it was horrendously slow to update. It literally took about a second per image to update, and you could see it updating the images one by one. I then switched to using bokeh but still rendering each image individually and it was still very slow. After some research I discovered a post on panel’s discourse describing a similar problem where one of the panel developers commented it was likely due to Bokeh’s layout engine. So I set out to create something using bokeh only and taking care not to trigger a relayout or update something unnecessary.

The logic I’m using is that I have a single ColumnDataSource, and a list of figures that can only grow. When a caller sets the list of images, the following happens:

1. If there are more images than figures, I create additional figures. If there are fewer, I set the visible property of the extra figures to False. All figures have a CDSView with an IndexFilter pointing to a single image. This index never changes, as the index is simply the index of the figure in my list of figures.
2. Update ColumnDataSource.data
3. Update the frame_width, frame_height, x_range and y_range of each figure
4. If new figures were created, merge their toolbars with the main toolbar (like gridplot with merge_toolbars=True, but dynamic) and add them to the GridBox.

What doesn’t work is the following:
If a caller provides a list of images that’s smaller than any previously provided list, later updates to the list of images no longer render. If you run the code below, try this sequence:

1. change start index → OK
2. increase num images → OK
3. change Ncols → relayout OK
4. decrease num images → OK
5. change start index → this stops working after step 4 (also if you skip steps 1-3)

I’ve really been struggling with why this doesn’t work, and I’m not sure whether I’m doing something wrong or whether I’ve stumbled upon a bug. If I set a breakpoint at the end of all my updates and run show(self.view.object) there, it shows the current state of the plot in a new browser window. If I do the same on step 4 above, I get an empty page (actually there was a point while I was messing with this where this showed the correct plot, but the main plot connected to the server didn’t update). I’ve also been checking the javascript console of my browser and I can see the client receive all the updates as I would expect them.

Since I’m posting a pretty lengthy example below, I have a few extra questions I would appreciate some help with too:

In _update_toolbar(), I first tried updating the toolbars and tools attributes of the ProxyToolbar object. This didn’t activate the tools on any new figures that were added. However if I pass multiple images into the constructor, the tools do work for all those initial figures. I don’t really understand what’s going on there since both are using the same code. As you can see, I ended up creating a whole new ToolbarBox to make it work.
I have a callback for my max_width and max_height parameters, which is supposed to change the size of all the images in the grid. In its current state this doesn’t work, but admittedly I haven’t spent a lot of time looking into why. If anyone has any pointers that’d be helpful. Do I need to update something about the GridBox, or create a new gridplot, or …?

I guess the overarching theme here is I don’t understand which attribute changes will actually trigger updates. From the documentation it seems like any attribute update on a bokeh object should do that, but that hasn’t been my experience. Any clarifications would be helpful, as well as suggestions on how to debug problems like this.

If you want to run this, pass a folder containing png images to the TestImageGrid constructor.

Code:

import itertools
import math
import cv2
import numpy as np
import param as pm
import panel as pn
from pathlib import Path
from bokeh.plotting import figure, Figure, gridplot, show
from bokeh.models import ColumnDataSource, LinearColorMapper, CDSView, IndexFilter, BoxZoomTool, \
    WheelZoomTool, ProxyToolbar, ToolbarBox


class ImageGrid(pm.Parameterized):
    images = pm.List(class_=np.ndarray, doc="List of images to display", precedence=-1)
    ncols = pm.Integer(default=3, bounds=(1, None), doc="Number of columns in grid")
    max_width = pm.Integer(default=300, bounds=(1, None), doc="Max width of a single image")
    max_height = pm.Integer(default=300, bounds=(1, None), doc="Max height of a single image")

    view = pm.ClassSelector(class_=pn.pane.Bokeh, constant=True, precedence=-1)

    _source = pm.ClassSelector(class_=ColumnDataSource, constant=True, precedence=-1)
    _figures = pm.List(class_=Figure, constant=True, precedence=-1)

    def __init__(self, **kwargs):
        super().__init__(view=pn.pane.Bokeh(gridplot(None)),
                         _source=ColumnDataSource(),
                         **kwargs)
        if self.images:
            self._update_figures()

    def _add_figure(self):
        index = len(self._figures)
        fig = figure(match_aspect=True, margin=10)
        color_mapper = LinearColorMapper(palette="Greys256")
        cds_view = CDSView(source=self._source, filters=[IndexFilter([index])])
        fig.image(source=self._source, view=cds_view,
                  image='image', x='x', y='y', dw='dw', dh='dh',
                  color_mapper=color_mapper)
        box_zoom = fig.select(type=BoxZoomTool)
        scroll_zoom = fig.select(type=WheelZoomTool)
        if box_zoom:
            box_zoom.match_aspect = True
        if scroll_zoom:
            scroll_zoom.zoom_on_axis = False
        fig.toolbar_location = None     # will be added to merged toolbar
        self._figures.append(fig)

    def _update_num_figures(self):
        num_figures = len(self._figures)
        num_images = len(self.images)
        if num_figures < num_images:
            for _ in range(num_images - num_figures):
                self._add_figure()
        for index, fig in enumerate(self._figures):
            fig.visible = index < num_images

    def _update_source(self):
        heights = [image.shape[0] for image in self.images]
        widths = [image.shape[1] for image in self.images]
        self._source.data = dict(
            x=[0] * len(self.images),
            y=heights,
            dw=widths,
            dh=heights,
            image=[image[::-1] for image in self.images]
        )

    @pm.depends('max_width', 'max_height', watch=True)
    def _set_image_dimensions(self):
        for index, (fig, image) in enumerate(zip(self._active_figures, self.images)):
            h, w = image.shape
            fig.x_range.update(start=0, end=w, bounds=(0, w))
            fig.y_range.update(start=h, end=0, bounds=(0, h))
            if h > w:
                fig.frame_height = self.max_height
                fig.frame_width = int(fig.frame_height * w / h)
            else:
                fig.frame_width = self.max_width
                fig.frame_height = int(fig.frame_width * h / w)

    def _update_toolbar(self):
        # Add to merged toolbar (see gridplot implementation)
        toolbars = [fig.toolbar for fig in self._figures]
        tools = list(itertools.chain.from_iterable([fig.tools for fig in self._figures]))
        # This doesn't work
        # proxy = self.view.object.children[0].toolbar
        # proxy.update(toolbars=toolbars, tools=tools)
        # This does work
        proxy = ProxyToolbar(toolbars=toolbars, tools=tools)
        self.view.object.children = [ToolbarBox(toolbar=proxy, toolbar_location='above')] + \
                                    self.view.object.children[1:]

    @pm.depends('images', watch=True)
    def _update_figures(self):
        self._update_num_figures()
        self._update_source()
        self._set_image_dimensions()
        if len(self.view.object.children[1].children) < len(self.images):
            self._update_toolbar()
            self._update_grid()
        pass    # set a breakpoint here and run `show(self.view.object)`

    @pm.depends('ncols', watch=True)
    def _update_grid(self):
        r, c = np.unravel_index(np.arange(len(self.images)), (self.nrows, self.ncols))
        gridbox = self.view.object.children[1]
        gridbox.children = list(zip(self._active_figures, r, c))

    @property
    def nrows(self):
        return math.ceil(len(self.images) / self.ncols)

    @property
    def _active_figures(self):
        return self._figures[:len(self.images)]


class TestImageGrid(pm.Parameterized):
    start_index = pm.Integer(0)
    num_images = pm.Integer(1, bounds=(0, None))
    load_func = pm.Callable(lambda filename: cv2.imread(str(filename), cv2.IMREAD_UNCHANGED),
                            precedence=-1)
    folder = pm.Foldername(constant=True, precedence=-1)
    files = pm.List(class_=Path, constant=True, precedence=-1)
    imagegrid = pm.ClassSelector(class_=ImageGrid, constant=True, precedence=-1)

    def __init__(self, **kwargs):
        super().__init__(imagegrid=ImageGrid(), **kwargs)
        with pm.edit_constant(self):
            self.files = list(Path(self.folder).glob('*.png'))
        self.param.start_index.bounds = 0, len(self.files)
        self.view = pn.Row(
            pn.WidgetBox(pn.Param(self.param), pn.Param(self.imagegrid.param)),
            self.imagegrid.view
        )
        self._set_images()
        self._stepsize()

    @pm.depends('start_index', 'num_images', watch=True)
    def _set_images(self):
        last_index = self.start_index + self.num_images
        images = [self.load_func(file) for file in self.files[self.start_index:last_index]]
        self.imagegrid.images = images

    @pm.depends('num_images', watch=True)
    def _stepsize(self):
        self.param.start_index.step = self.num_images


def view():
    tig = TestImageGrid(folder="images/")
    return tig.view


if __name__.startswith("bokeh"):
    view().servable()
elif __name__ == '__main__':
    pn.serve(view, port=8920)

Bryan · July 16, 2021, 5:16pm

I don’t have a folder of images handy, and I have no idea what size or sort of images might make a difference in reproducing what you are seeing. Given that this is already 160 lines of code to look into, it is best if you make things as unambiguous as possible for anyone looking into it. A GH repo with suitable images and a reproducible conda env file would be ideal.

tvdhauwe · July 16, 2021, 5:53pm

I fixed the main issue. Say that at some point I have 3 images, meaning I have a ColumnDataSource with 3 rows and 3 figures that each have a CDSView with an IndexFilter whose indices attributes are [0], [1], [2] respectively. If I update the datasource to have 2 rows instead, I have to update the IndexFilter of the last figure to be [] or it breaks the model.

I had actually tried this before and it didn’t work. Turns out the order of operations is important. I have to update the IndexFilters before updating the datasource.
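
The ordering described above can be sketched roughly as follows. This is an illustration, not the poster's actual code: the helper name `set_rows` is hypothetical, and it is written against any objects that expose `.indices` and `.data` so that it runs with the simple stand-ins shown (in the real component these would be the IndexFilter and ColumnDataSource objects). The third step, restoring filters once their rows exist again, is an assumption about what is needed when the list grows back after a shrink.

```python
from types import SimpleNamespace


def set_rows(source, index_filters, new_data):
    """Replace source.data without leaving any filter pointing at a missing row.

    `source` needs a `.data` dict of equal-length columns and each filter an
    `.indices` list (as ColumnDataSource and IndexFilter have).
    """
    # Number of rows in the new data (all columns are assumed equally long).
    num_rows = len(next(iter(new_data.values()), []))

    # 1. Filters first: blank every filter whose row is about to disappear.
    for i, index_filter in enumerate(index_filters):
        if i >= num_rows and index_filter.indices:
            index_filter.indices = []

    # 2. Only now swap in the new data.
    source.data = new_data

    # 3. Assumption: filters for rows that exist again are restored afterwards.
    for i, index_filter in enumerate(index_filters[:num_rows]):
        if index_filter.indices != [i]:
            index_filter.indices = [i]


# Stand-ins for a 3-row ColumnDataSource and three IndexFilters.
source = SimpleNamespace(data={"image": ["a", "b", "c"]})
filters = [SimpleNamespace(indices=[i]) for i in range(3)]

set_rows(source, filters, {"image": ["a", "b"]})  # shrink from 3 rows to 2
print([f.indices for f in filters])  # [[0], [1], []]
```

Only attributes that actually change are assigned, which should keep the number of change events small.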

@Bryan, @MarcSkovMadsen, this is where you guys could clear up some confusion for me about potential differences between the Bokeh and Panel server. I’ve worked with both now, and it seems like with Bokeh only, any changes to the model are applied after the callback finishes. With Panel it looks like each modification is immediately sent to the browser. That would mean that if I’d been using Bokeh only, the order of operations wouldn’t have mattered. Is this correct?

@Bryan, any thoughts on my other two questions?

In _update_toolbar(), I first tried updating the toolbars and tools attributes of the ProxyToolbar object. This didn’t activate the tools on any new figures that were added. However if I pass multiple images into the constructor, the tools do work for all those initial figures. I don’t really understand what’s going on there since both are using the same code. As you can see, I ended up creating a whole new ToolbarBox to make it work.

I have a callback for my max_width and max_height parameters, which is supposed to change the size of all the images in the grid. In its current state this doesn’t work, but admittedly I haven’t spent a lot of time looking into why. If anyone has any pointers that’d be helpful. Do I need to update something about the GridBox, or create a new gridplot, or …?

Bryan · July 16, 2021, 8:48pm

I can’t really clear up anything re: panel serve, since I have very little direct experience with it. Bokeh callbacks do synchronize at the end of the function, which is why if you want something to happen “in the middle” you currently need to utilize “next tick callbacks” and the like. At some point I’d like to make it possible to yield in the middle of callbacks to trigger a sync, but I have no idea when that might actually get done.
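
For illustration, splitting an update across a next tick callback might look like the sketch below. `add_next_tick_callback` is a method on Bokeh's Document; the helper name `run_in_two_steps` and the `FakeDocument` stand-in are hypothetical and exist only so the snippet runs without a server.

```python
def run_in_two_steps(doc, first_step, second_step):
    """Run first_step now and defer second_step to a later tick.

    `doc` is expected to be a Bokeh Document (e.g. from curdoc()); anything
    with an `add_next_tick_callback(callback)` method works for this sketch.
    """
    first_step()
    # Per the description above, changes made by first_step are synced when
    # the current callback returns; second_step runs in its own later callback.
    doc.add_next_tick_callback(second_step)


class FakeDocument:
    """Hypothetical stand-in that just queues callbacks for a later 'tick'."""

    def __init__(self):
        self.pending = []

    def add_next_tick_callback(self, callback):
        self.pending.append(callback)

    def tick(self):
        # Run and clear everything that was queued for this tick.
        callbacks, self.pending = self.pending, []
        for callback in callbacks:
            callback()


calls = []
doc = FakeDocument()
run_in_two_steps(doc, lambda: calls.append("filters"), lambda: calls.append("data"))
# Only the first step has run at this point; the second runs on the next tick.
doc.tick()
```

In a real app the two steps would be the two halves of the model update that need a sync between them.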

However, this:

That would mean that if I’d been using Bokeh only the order of operations wouldn’t have mattered. Is this correct?

Is too simplistic. While the sync events are all sent in one batch at the end of the callback, in the browser they can still only be reacted to individually in some specific order, since there is only one thread of execution. For many (most?) things this is not really an issue, but in the case of data sources and views there are dependencies between them that probably imply race conditions depending on what that order of execution turns out to be. A really robust solution would probably necessitate encapsulating combined CDS+View updates into one logical event so that that sort of update can always be treated consistently together in a known working order.
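
The ordering problem can be illustrated with a toy model in plain Python. This is not how BokehJS is implemented; `first_inconsistent_event` and the dictionary layout are made up for the example. It applies events one at a time, as a single thread would, and reports the first event after which a view index points past the end of the data.

```python
def first_inconsistent_event(state, events):
    """Apply (key, value) events one by one; return the key of the first event
    that leaves an index pointing at a missing row, or None if none does."""
    state = dict(state)  # work on a copy so the caller's state is untouched
    for key, value in events:
        state[key] = value
        num_rows = len(state["data"])
        if any(i >= num_rows for i in state["indices"]):
            return key
    return None


# Three rows, and a view that shows only the last one.
start = {"data": ["a", "b", "c"], "indices": [2]}

# The same two changes, delivered in the two possible orders.
data_first = [("data", ["a", "b"]), ("indices", [])]
view_first = [("indices", []), ("data", ["a", "b"])]

print(first_inconsistent_event(start, data_first))  # data
print(first_inconsistent_event(start, view_first))  # None
```

Both batches end in the same final state; only the intermediate state differs, and that intermediate state is what an individually processed event gets to see.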

That’s all I have time for just now, I will have to look at the others later.

tvdhauwe · July 16, 2021, 8:54pm

Thanks, this is already very helpful. I can kind of look at it as a sequence of modifications getting sent to the client, as opposed to all the modifications from the python code getting aggregated into a single update and then sent to the client.

Bryan · July 16, 2021, 9:12pm

I guess I’d also add, AFAIK panel serve is built on top of Bokeh so I am not sure how it would/could behave differently re: when things are synchronized. But it’s possible there is some difference I don’t know about. cc @Philipp_Rudiger

Philipp_Rudiger · July 16, 2021, 9:34pm

Will have to look at this in more detail for the other questions, but panel does have some logic that tries to dispatch events immediately if at all possible (i.e. whether the document lock can be safely bypassed).

Philipp_Rudiger · July 16, 2021, 9:38pm

I really want to work with Mateusz to not only allow batching events but also ensure they are applied as a single batch on the JS end. Because right now it can be difficult to make certain changes, e.g. if you want to change the column name you have to coordinate the changes to the glyph and the CDS.

Bryan · July 16, 2021, 9:47pm

Yes, this will come down to identifying operations that need new dedicated events. That will be great for panel, though on the pure Bokeh side of things it will probably be a perennial issue to educate users about special cases that have dedicated API/events vs just trying to set the individual properties as they would now.

Philipp_Rudiger · July 16, 2021, 10:17pm

Should probably move this discussion elsewhere, but I was really hoping to avoid that and just allow having batched events freeze all signals on the JS end until all model change events have been applied. That still requires some knowledge of when you have to batch, but it would also allow for much more efficient updates since you won’t be triggering a bunch of redraws (and layout engine updates).

tvdhauwe · July 16, 2021, 10:32pm

I’ve noticed that some of my changes get sent twice. E.g. if I update the datasource, I see the images being sent in the browser as binary frames. But since I then also update the GridBox, I also see a huge JSON update which contains the image data again.

I just stumbled upon Document.hold() / .unhold(). Is this what @Philipp_Rudiger is referring to?

I’ve managed to solve all of my issues except for one. When I try to resize the images, I update the frame_width and frame_height for each figure (in _set_image_dimensions()). This doesn’t immediately update the images, but they do change size whenever any other update triggers an update of the GridBox (e.g. changing the number of columns or the images themselves).

See below for a few things I’ve tried. I guess what I’m looking for is a way to manually trigger a redraw in the client. The hacky way I can make it work is to toggle the spacing attribute of the gridbox between 0 and 1.

    @pm.depends('max_width', 'max_height', watch=True)
    def _update_image_size(self):
        self._set_image_dimensions()
        # This triggers a redraw because the attribute changes
        self._gridbox.spacing = 1 - self._gridbox.spacing
        # Neither of these trigger a redraw
        # self._gridbox.trigger('children', self._gridbox.children, self._gridbox.children)
        # self._update_grid()

Bryan · July 16, 2021, 10:44pm

There were some versions with bugs related to over-serialization. It’s always advised to state version information in questions.

I just stumbled upon Document.hold() / .unhold(). Is this what @Philipp_Rudiger is referring to?

No, hold functions across multiple callback invocations (it’s more like throttling).

tvdhauwe · July 16, 2021, 10:48pm

Using 2.3.2. My comment about the multiple update was really just FYI in response to Philipp.

Is there any way to force a redraw of a GridBox without changing its children (the figures are still the same, but their frame_width and frame_height have been updated) or spacing attribute? Or really any attribute for that matter.