How do I make this less hacky? (working code, trying to update "chosen" points in a scatter plot based on interactions in other plots)

sg-s · February 16, 2022, 6:34pm

I want to build a little thing that allows you to visualize different plots by either scrubbing through a dataset, or by hovering over points in scatter plots. I have a working example here:

import numpy as np
import bokeh
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import HoverTool
from bokeh.layouts import layout

output_notebook()

# make some fake data
n = 100
n_time_pts = 1000
time = np.linspace(0,1,n_time_pts)
all_traces = [np.random.random(n_time_pts) + x for x in np.arange(n)]

x = [np.mean(x) for x in all_traces]
y = [np.std(x) for x in all_traces]
z = [np.std(x)/np.mean(x) for x in all_traces]

source = ColumnDataSource(data=dict(time=time, y=all_traces[0]))
scatter_source = ColumnDataSource(data=dict(x=x, y=y,z=z))
marker_source = ColumnDataSource(data=dict(x=[1], y=[1], z=[1]))

slider = Slider(start=0, end=n - 1, step=1, value=0, max_height=30, max_width=500)  # last valid index is n - 1

# make figure panels
trace_fig = figure(width=1000, height=500, tools=[])
scatter1 = figure(sizing_mode="stretch_width", 
                 height=500, 
                 max_width = 500,
                 tools=[])
scatter2 = figure(sizing_mode="stretch_width", 
                 height=500, 
                 max_width = 500,
                tools=[])

# make colors
colors = list(bokeh.palettes.inferno(n))
rejected_color = "#969696"
for i in np.arange(10):
    colors[i] = rejected_color
colors = tuple(colors)

# make plots in figures
trace_fig.line('time','y',source=source)

scatter1.circle(x,y, size=10, color=colors, alpha=0.8, hover_alpha=0)
marker1 = scatter1.circle('x','y',size=20,fill_color = None, color="red", source=marker_source)

scatter2.circle(x,z, size=10, color=colors, alpha=0.8)
marker2 = scatter2.circle('x','z',size=20,fill_color = None, color="red", source=marker_source)

# add a hover tool that sets the link data for a hovered circle
code = """
    if (cb_data.index.indices.length > 0) {
        if (cb_data.index.indices[0] > 0) { // dirty hack to ignore the zero index of the marker
            slider.value = (cb_data.index.indices[0])
            }
    };

"""

callback = CustomJS(args=dict(slider=slider), code=code)


scatter1.add_tools(HoverTool(tooltips=[],callback=callback))
scatter2.add_tools(HoverTool(tooltips=[],callback=callback))

# slider callback
callback = CustomJS(args=dict(source=source, 
                              all_traces=all_traces, 
                              slider=slider,
                             marker_source=marker_source,
                             scatter_source=scatter_source),
                    code="""
                        source.data.y = all_traces[slider.value];
                        source.change.emit();
                        marker_source.data.x[0] = scatter_source.data.x[slider.value];
                        marker_source.data.y[0] = scatter_source.data.y[slider.value];
                        marker_source.data.z[0] = scatter_source.data.z[slider.value];
                        marker_source.change.emit();
                    """)

slider.js_on_change('value', callback)


show(layout([
        [slider],
        [trace_fig],
        [scatter1, scatter2],
       ]))

but it’s pretty hacky. specifically:

the way I update the “hovered” point in the two scatter plots is by using a different scatter, and ignoring callbacks with index 0.
the tooltips are buggy too, and render as small white boxes.

any advice to improve this would be welcome! thank you!

gmerritt123 · February 17, 2022, 2:57pm

I like what you’ve managed to hack out here, some nice daisy-chaining of callbacks getting triggered on hover events. My approach to this kind of setup would have been:

One ColumnDataSource to rule them all. This makes the most of the feature described in Providing data — Bokeh 2.4.2 Documentation: if renderers share the same data source, they are intrinsically and “naturally” linked together.

So that’s easy to do, and you essentially already are doing that with scatter_source → scatter1 contains one renderer using it and scatter2 contains another renderer using it. But how to handle the line data (multiple x/y values associated with a single point location)?

This is absolutely where the MultiLine glyph shines. It takes an array of arrays for its xs and ys arguments, so you can build your source up like this:

data = dict(x=x, y=y, z=z, time=[time for _ in all_traces], all_traces=all_traces)
src = ColumnDataSource(data)

data here will look like this:

x    y    z    time           all_traces
1    2    3    [0,1,2…999]    [3,4,6,1…]
2    1    2    [0,1,2…999]    [7,8,1,5…]

Then create a MultiLine glyph/renderer pointing to time and all_traces as the xs and ys args.

But wait, won’t that plot 100 lines all at once on my plot?

Yes. But that’s where IndexFilter or CustomJSFilter comes in. See Providing data — Bokeh 2.4.2 Documentation.

You create a filter that keeps only the index of the current slider value, create a CDSView with that filter, and add that CDSView to your multiline renderer.
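Roughly, that could look like the sketch below. It is a minimal version under a few assumptions, not a drop-in replacement: the fake data mirrors the question, make_view is a hypothetical helper I added because the CDSView constructor differs between Bokeh 2.4 and 3.x, and the red marker ring reuses the same filtered view instead of a separate marker source. Restricting each HoverTool to the main scatter renderer and passing tooltips=None is one way to avoid both the index-0 workaround and the empty tooltip boxes.

```python
import numpy as np
from bokeh.layouts import layout
from bokeh.models import (CDSView, ColumnDataSource, CustomJS, HoverTool,
                          IndexFilter, Slider)
from bokeh.plotting import figure

# Fake data with the same shape as in the question.
n, n_time_pts = 100, 1000
time = np.linspace(0, 1, n_time_pts)
all_traces = [np.random.random(n_time_pts) + k for k in range(n)]
x = [float(np.mean(t)) for t in all_traces]
y = [float(np.std(t)) for t in all_traces]
z = [sd / m for sd, m in zip(y, x)]

# One row per scatter point; each row also carries its full trace.
src = ColumnDataSource(data=dict(
    x=x, y=y, z=z,
    time=[time for _ in all_traces],
    all_traces=all_traces,
))

# The filter keeps a single row: the one the slider currently points at.
selected = IndexFilter(indices=[0])


def make_view(source, index_filter):
    """Hypothetical helper: build a CDSView on Bokeh 3.x or Bokeh 2.4."""
    try:
        return CDSView(filter=index_filter)  # Bokeh 3.x signature
    except AttributeError:
        return CDSView(source=source, filters=[index_filter])  # Bokeh 2.4


view = make_view(src, selected)
slider = Slider(start=0, end=n - 1, step=1, value=0, title="trace")

# All n traces live in the source, but the view lets only one through.
trace_fig = figure(width=1000, height=400, tools="")
trace_renderer = trace_fig.multi_line(xs="time", ys="all_traces",
                                      source=src, view=view)

hover_code = """
    const hit = cb_data.index.indices;
    if (hit.length > 0) { slider.value = hit[0]; }
"""

scatter_figs = []
for y_col in ("y", "z"):
    fig = figure(width=500, height=400, tools="")
    points = fig.scatter("x", y_col, source=src, size=10, alpha=0.8)
    # Marker ring: same source, restricted to the selected row by the view.
    fig.scatter("x", y_col, source=src, view=view, size=20,
                fill_color=None, line_color="red")
    # Hover only inspects the main points, so the marker never fires it.
    fig.add_tools(HoverTool(
        renderers=[points], tooltips=None,
        callback=CustomJS(args=dict(slider=slider), code=hover_code)))
    scatter_figs.append(fig)

# Moving the slider swaps the filtered index; emitting a change on the
# source asks the renderers that use the view to redraw.
slider.js_on_change("value", CustomJS(
    args=dict(selected=selected, src=src),
    code="""
        selected.indices = [cb_obj.value];
        src.change.emit();
    """))

page = layout([[slider], [trace_fig], scatter_figs])
```

Call show(page) (after output_notebook() in a notebook) to render it. The slider callback only has to touch the filter, because the trace line and both marker rings all read from the same view.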

Your approach totally works, but that’s how I would have gone after this. Life is so much easier when you have just one CDS to worry about.

sg-s · February 17, 2022, 3:44pm

thank you so much @gmerritt123! i figured there was some better way to do it, and i see it now.
