Bug with figure.image in Bokeh 3.1.1

Aziuth · June 19, 2023, 1:02pm

I recently switched from Bokeh 2.4.3 to Bokeh 3.1.1 and found several of my plots not working anymore.
The plots in question used figure.image to create a pixel-based plot.

What I found is that figure.image now seems to have problems with raw Python lists and only works if I convert them to NumPy arrays.
My code looked somewhat like this:

        data = []
        for <some iteration>
            <some computation>
            data.append([sub_data])

        source = ColumnDataSource()
        source.data['data'] = data

        plot = ...
        plot.image(source=source, image='data')

To make this work again, I had to change the line within the loop to

        data.append(np.array([sub_data]))

which to me looks like a code smell.

As an alternative way to reproduce this, you can take the example code given on the "image" page of the Bokeh 3.1.1 documentation, which naturally produces NumPy arrays, and replace [d] with [d.tolist()] in the call to p.image.

Bryan · June 19, 2023, 1:54pm

This is not a bug, it was an explicit change made at the time Bokeh 3.0 was released. From the migration notes:

Image glyphs Image and ImageRGBA require 2d ndarray data for images, ragged “lists of lists” are no longer supported.

I can’t tell from your code snippet whether you are generating a list of 2d arrays for multiple images, or a list of 1d arrays that make up a single image. If it is the latter, then note that such usage “working” is accidental and unintentional and unsupported (should not be relied on). Individual single images are expected to be 2d numpy arrays (which is common for almost any python library that processes image data).
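A minimal sketch of the conversion, assuming the rows of a single image were collected as a plain list of lists. The names `rows` and `image`, the ranges, and the palette are placeholders for illustration, not taken from the code above:

```python
import numpy as np
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical stand-in for data collected row by row as plain Python lists.
rows = [[0, 1, 2], [3, 4, 5]]

# Convert once, after the loop, into a single 2d array for the image.
image = np.asarray(rows, dtype=np.float64)
assert image.ndim == 2

# The "image" column holds one 2d array per image to draw (here: one).
source = ColumnDataSource(data=dict(image=[image], x=[0], y=[0], dw=[3], dh=[2]))

plot = figure(x_range=(0, 3), y_range=(0, 2))
plot.image(image="image", x="x", y="y", dw="dw", dh="dh",
           source=source, palette="Viridis256")
```

Converting once after the loop keeps the NumPy call out of the per-row code.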

Aziuth · June 19, 2023, 2:44pm

That answers my question, okay, I can live with that.

Other than that, your guess was right - I am using several images within one. Basically I have three sets of data, and for each I create a separate field; all of them are rendered within the same plot. At a later stage I will probably replace the three images with a single one, but the number of images can vary and the data comes from different objects, which is why I chose this approach.

I’d like to follow your advice and combine them, but the thing is that I have another part of my code that goes somewhat like this:

source.data['measure'] = ['<First Measure>', '<Second Measure>', '<Third Measure>']

TOOLTIPS = [
    ('Measure', '@measure'),
    ...
]

plot = figure(... tooltips=TOOLTIPS)

Like this, when you hover over a sub-image, the tooltip tells you which measure it belongs to.
If I merged the sub-images, I’d have to store the measure names as whole fields of strings, which to me feels quite wasteful. Or is there some other option that I am not aware of?

Bryan · June 20, 2023, 3:24am

@Aziuth I suspect we may not be using terminology in the same way, so I am still not sure if those are separate images (which the image glyph can handle perfectly well, as long as they are all 2d arrays) and if so, what actual format they are in. So I can’t really comment. I should have asked earlier, but it is always advised to provide concrete code in the form of a complete Minimal Reproducible Example since that helps to focus and clarify discussions immediately.

Aziuth · June 20, 2023, 9:33am

I mean, I can’t really give you the full picture without giving my actual code, which I can’t.
I could give you something faked, but in this context, this wouldn’t be that useful.

The basic thing is this: I do have a 3d array, which contains 2d images that are to be rendered in the same figure, stacked on top of each other.
Those come together with some other attributes, which practically form three sets of data. Originally everything was done with raw python lists, but given the change in my original post, the image data is now a raw list containing two-dimensional numpy arrays.
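As an illustration of that layout, a 3d NumPy array can be split into a list of 2d arrays by iterating over its first axis. This is only a sketch; `stack` and its shape are made up for the example:

```python
import numpy as np

# Hypothetical 3d array: 3 images, each with 2 rows and 4 columns.
stack = np.arange(24, dtype=np.float64).reshape(3, 2, 4)

# Iterating a 3d array yields its 2d slices along the first axis,
# so this gives a plain list containing three 2d arrays.
images = list(stack)
```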

I use a big ColumnDataSource that contains everything to be rendered. Amongst other things, I have the fields

['data'] # the actual image data, an array containing three 2d-arrays
['measure'] # three strings that are basically the names of the three sub-images
['x'], ['y'] # the lowest coordinates for where the data is to be rendered, three items each
['dw'], ['dh'] # the sizes that the sub-images are to have, three items each

Those are rendered using

mapper = LinearColorMapper(palette=palette, low=0, high=size_accumulator)
plot.image(source=self.source, image='data', x='x', y='y', dw='dw', dh='dh', color_mapper=mapper)

and measure is used in the hover effect.

Given how Bokeh currently works (although, as I understand you, this is accidental), this code effectively produces three packages that happen to be rendered in the same plot.
If I programmed everything myself, I’d probably go for an orthogonal design: a class that stores the data of a single sub-image together with its corresponding attributes, and then an array of three objects of that class.
If you told me that I can do each sub-image separately, maybe in its own plot as an overlay, or as its own item in the plot (as is possible with other items), that would also be fine for me.

Aziuth · June 20, 2023, 11:37am

I just found another thing related to this that doesn’t work like it used to anymore. Not sure whether this is a bug or whether I am using accidental side behaviour here, though.

So, I have a plot and I want to change its data in the background. I want to make those changes one by one, without the rendering being updated after each single change, only once at the end.
Not causing updates immediately works quite well as long as arrays in the source are only modified in place and not reassigned.
As an example, I am culling some arrays, and if I wrote something like

source.data['data'] = source.data['data'][0:length]

this would trigger an update.
However, if I go with

source.data['data'][:] = source.data['data'][0:length]

it doesn’t, since it doesn’t assign the list as a whole.

This still works, so far so good.

However, at the end, I want to trigger an update on purpose. For that, I am assigning an array to itself, which in turn used to trigger the update:

source.data['data'] = source.data['data']

However, this doesn’t work anymore since I switched to NumPy arrays. I don’t know why.
Luckily I do have some other parts of the source which don’t use NumPy. For example, I have a column names which stores a raw list of strings, and when I change the code to

source.data['names'] = source.data['names']

it triggers the update again.

Again, not sure whether this is a bug or whether I am using accidental side behaviour anyway.

That said, I’d really like there to be methods with which I could do those things directly without tricks. Like, say, figure::freeze, figure::unfreeze and figure::update.

Aziuth · June 20, 2023, 11:53am

Even more, in addition to my last comment: when the plot is refreshed, some of the updates made in the background are picked up automatically, and others aren’t. I found out that in order to change the y-coordinates of an image, I had to trigger this by calling source.data['y'] = source.data['y']. This was not necessary in 2.4.3.

Just to make sure, I changed my update method to

    def _trigger_update(self) -> None:
        for name in self.source.data:
            self.source.data[name] = self.source.data[name]

which sadly already causes a “hiccup”.

Bryan · June 20, 2023, 3:10pm

The previous behavior was a bug. For eventing purposes, self-assignment should always be a no-op. This bug was fixed at some point.

The advised best practice here is to always update an entire new dict separately, and then assign source.data “atomically” at the end.

new_data_dict = {}
# populate new_data_dict
source.data = new_data_dict

Updating individual columns is possible, and reasonable if you only need to update one column. But if you need to update multiple columns then updating them separately will trigger eventing on every separate update, which is usually not desirable.
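A slightly fuller sketch of that pattern, applied to the culling case from earlier in the thread. The column contents and the `length` value are made up for illustration:

```python
import numpy as np
from bokeh.models import ColumnDataSource

# Hypothetical source with one image column and two plain columns.
source = ColumnDataSource(data=dict(
    data=[np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)],
    names=["a", "b", "c"],
    y=[0, 2, 4],
))

length = 2  # hypothetical number of entries to keep

# Build the replacement dict off to the side; source.data is not touched yet.
new_data = {name: list(column)[:length] for name, column in source.data.items()}
new_data["y"] = [value + 10 for value in new_data["y"]]

# One assignment replaces all columns at once.
source.data = new_data
```

Because every column is replaced in the same assignment, the columns are never seen with mismatched lengths.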

Aziuth wrote: “That said, I’d really like there to be methods with which I could do those things directly without tricks. Like, say, figure::freeze, figure::unfreeze and figure::update.”

The hold and unhold APIs were added in 0.12.10

https://github.com/bokeh/bokeh/blob/branch-3.3/examples/server/app/hold_app.py
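A rough sketch of how hold and unhold might be used around a batch of column updates. It builds a standalone `Document` so it runs as a script; in a Bokeh server app the document would normally come from `curdoc()` instead. The data and the choice of the "combine" policy are assumptions for illustration:

```python
from bokeh.document import Document
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data=dict(x=[0, 1, 2], y=[0, 1, 4]))
plot = figure()
plot.line(x="x", y="y", source=source)

# Standalone document for the sketch; a server app would use curdoc().
doc = Document()
doc.add_root(plot)

# Hold document change events while several columns are updated.
doc.hold("combine")
try:
    source.data["y"] = [0, 2, 8]
    source.data["x"] = [0, 2, 4]
finally:
    # Release the hold so the collected events are processed.
    doc.unhold()
```

The try/finally is there so the hold is released even if one of the updates raises.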

Edit:

Aziuth wrote: “['data'] # the actual image data, an array containing three 2d-arrays”

This is the expected format. Like all glyphs, image is “vectorized”. Usually that means a vector of numbers (e.g. coordinates for a scatter marker) but in this case it is a vector of 2d image arrays, one for each image to show.
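A minimal sketch of that vectorized layout, assuming three equally sized images stacked vertically. The column names follow the ones listed earlier in the thread; the data, positions, and palette are made up for the example:

```python
import numpy as np
from bokeh.models import ColumnDataSource, LinearColorMapper
from bokeh.palettes import Viridis256
from bokeh.plotting import figure

# Hypothetical data: three 2d arrays, one per sub-image.
images = [np.full((4, 5), fill_value=float(i)) for i in range(3)]

# Every column has one entry per image.
source = ColumnDataSource(data=dict(
    data=images,
    measure=["First Measure", "Second Measure", "Third Measure"],
    x=[0, 0, 0],
    y=[0, 5, 10],
    dw=[5, 5, 5],
    dh=[4, 4, 4],
))

mapper = LinearColorMapper(palette=Viridis256, low=0, high=2)
plot = figure(tooltips=[("Measure", "@measure")])
plot.image(source=source, image="data", x="x", y="y", dw="dw", dh="dh",
           color_mapper=mapper)
```

With this layout the measure names stay a column of three strings, one per image, rather than a field of strings per pixel.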

system · September 18, 2023, 3:10pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.