This is an extension of the question "CustomJS for selected indicies after region selection in multi_line model", and an answer would probably largely resolve my motive behind this feature request: [FEATURE] MultiScatter · Issue #12367 · bokeh/bokeh · GitHub
What I tried to do was come up with a way of plotting the MultiLine coordinates as individual points on a Scatter WITHOUT duplicating the data (i.e. without making two CDSs, one containing the unflattened list-of-lists data and another containing the same data flattened).
My idea was to use a CustomJSTransform that flattens the “list of lists”:
from bokeh.plotting import figure, show, save
from bokeh.transform import transform
from bokeh.models import MultiLine, ColumnDataSource, CustomJS, CustomJSTransform, CustomJSExpr
from bokeh.core.properties import expr
import numpy as np
x = [[1,2,3,4,5], [1,2,3,4,5]]
y = [[8,6,5,2,3], [3,2,5,6,8]]
s1 = ColumnDataSource(data=dict(x0=x, y0=y))
f = figure(tools='lasso_select')
r=f.multi_line(xs='x0',ys='y0',source=s1)
tr = CustomJSTransform(v_func='''
return xs.flat()
''')
rs = f.scatter(x=transform('x0',tr),y=transform('y0',tr)
,fill_color='red',source=s1)
show(f)
The result is that only the first two points of the Scatter are plotted. This seems to be because s1.data’s columns have a length of two (one entry per line), whereas the transformed/flattened data actually has a length of 10.
If I try a variation of this, where I create a separate CDS with identical column names to drive the flattened result, and pass the original unflattened CDS into the CustomJSTransform:
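To make the length mismatch concrete, here is a minimal pure-Python sketch (no Bokeh involved) that compares the column length the CDS reports with the number of points after a one-level flatten. The helper name `flatten_one_level` is hypothetical and is only meant to mimic what JavaScript's `Array.prototype.flat()` does inside the transform; it says nothing about how Bokeh itself sizes glyphs internally.

```python
from itertools import chain

# Same nested coordinates as in the example above.
x = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
y = [[8, 6, 5, 2, 3], [3, 2, 5, 6, 8]]


def flatten_one_level(nested):
    """Hypothetical helper: flatten a list of lists by one level,
    similar to JavaScript's Array.prototype.flat()."""
    return list(chain.from_iterable(nested))


# Length of the CDS column: one entry per line.
column_length = len(x)
# Number of individual points once the sublists are flattened.
flat_length = len(flatten_one_level(x))

print(column_length, flat_length)
```

If the scatter glyph takes its size from the column length rather than from the transform's output, that would be consistent with only two markers being drawn.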
from bokeh.plotting import figure, show, save
from bokeh.transform import transform
from bokeh.models import MultiLine, ColumnDataSource, CustomJS, CustomJSTransform, CustomJSExpr
from bokeh.core.properties import expr
import numpy as np
x = [[1,2,3,4,5], [1,2,3,4,5]]
y = [[8,6,5,2,3], [3,2,5,6,8]]
s1 = ColumnDataSource(data=dict(x0=x, y0=y))
sf = ColumnDataSource(data=dict(x0=[],y0=[]))
f = figure(tools='lasso_select')
r=f.multi_line(xs='x0',ys='y0',source=s1)
xtr = CustomJSTransform(args=dict(s1=s1)
,v_func='''
return s1.data['x0'].flat()
''')
ytr = CustomJSTransform(args=dict(s1=s1)
,v_func='''
return s1.data['y0'].flat()
''')
rs = f.scatter(x=transform('x0',xtr),y=transform('y0',ytr)
,fill_color='red',source=sf)
show(f)
As I half expected, I get no scatter points plotted, because really I’m just doing the same thing with extra steps, except this time I’m relying on an empty (zero-length) CDS.
Finally, I found I can get the above setup to work, but ONLY if I instantiate my “flat source” (sf) with data equal in length to the flattened result. I can fill it initially with zeros, for example:
l = sum(len(sublist) for sublist in x)  # total number of points after flattening
sf = ColumnDataSource(data=dict(x0=np.zeros(l),y0=np.zeros(l)))
But in cases where I have tens of thousands of coordinates, I don’t want to write tens of thousands of 0s to the HTML output just to provide a completely redundant initial state. Is there a key mechanism I’m missing, or is this something I simply can’t do with the available built-in tools right now?
I’ll also mention I found an example here: bokeh/customjs_expr.py at branch-2.4 · bokeh/bokeh · GitHub. It uses CustomJSExpr and also seems like it might be usable to achieve this, with the significant difference from my attempts above being that it also demonstrates how to build a custom DataModel. I’m wondering whether a custom DataModel could be used to do what I’m after, and if so, some hints/guidance on that would be appreciated. Thanks!