Accessing data in CDS using CDSView

Hi all,

For my Bokeh app, I need to do some processing on the subset of the CDS data selected by a CDSView. I can’t figure out how to do this, i.e. how to access the ndarrays stored in the CDS with the CDSView filters applied.

Of course I could reverse engineer it, and write a helper function with boolean masking, but surely this is available somewhere in the API?

So, for example, with:

from bokeh.models import ColumnDataSource, CDSView, GroupFilter
from bokeh.sampledata.iris import flowers

source = ColumnDataSource(flowers)
view1 = CDSView(source=source, filters=[GroupFilter(column_name='species', group='versicolor')])

I want to do something like:

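# Boolean mask built by hand from the first filter (relies on the columns being NumPy arrays)
idx = (source.data[view1.filters[0].column_name] == view1.filters[0].group)
# Apply the same mask to every column
filtered = {k: v[idx] for k, v in source.data.items()}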

but using the API, and also handling multiple filters, etc.

Thanks in advance,
Daniel

There is not, in fact. The main reason is simply that there is no assumption that CDS columns must be NumPy arrays. They could also be plain lists or tuples, which do not support fancy indexing.

@Bryan
Thanks for clarifying. Aside from fancy indexing, my understanding is that somewhere in the code base the filters in the CDSView must be processed in order to produce the right figures.

Or is this done on the JavaScript side with compute_indices, and hence there is no corresponding method on the Python side?

@dkapitan That’s exactly correct, all the actual work of CDSView (and all the work of almost all Bokeh models, in fact) is done on the BokehJS side.

@Bryan
Great, thanks again for clarifying. I will solve my use case some other way, then.

And I guess I would add: It’s not necessarily the case that we can’t add features that only apply in some cases or scenarios, we just have to be careful and clear with the API and docs when it can be expected to work, and when not. If you gain some experience making your own helper functions and then want to propose some addition to Bokeh based on that, please do make a GitHub issue.
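As a starting point for such a helper, here is a minimal sketch in pure Python. It is not part of the Bokeh API: `selected_indices` and `filtered_data` are hypothetical names, filters are recognised only by their attributes (`column_name`/`group`, `booleans`, `indices`), and multiple filters are assumed to combine as an intersection. It avoids fancy indexing, so it should also work when columns are plain lists or tuples. The `SimpleNamespace` objects are stand-ins for filter objects so that the example runs without Bokeh installed.

```python
from types import SimpleNamespace


def selected_indices(data, filters):
    """Return the sorted row indices that pass every filter (logical AND).

    `data` is a dict of equal-length columns, e.g. `source.data`.
    Filters are matched by attribute name (duck typing); this is an
    assumption of the sketch, not something Bokeh provides.
    """
    n_rows = len(next(iter(data.values()))) if data else 0
    keep = set(range(n_rows))
    for f in filters:
        if hasattr(f, "column_name") and hasattr(f, "group"):
            # GroupFilter-like: keep rows whose column value equals the group
            column = data[f.column_name]
            keep &= {i for i in range(n_rows) if column[i] == f.group}
        elif getattr(f, "booleans", None) is not None:
            # BooleanFilter-like: one True/False flag per row
            keep &= {i for i, flag in enumerate(f.booleans) if flag}
        elif getattr(f, "indices", None) is not None:
            # IndexFilter-like: explicit list of row indices
            keep &= set(f.indices)
        else:
            raise TypeError(f"unsupported filter: {f!r}")
    return sorted(keep)


def filtered_data(data, filters):
    """Apply the filters and return a new dict of plain lists."""
    idx = selected_indices(data, filters)
    return {name: [column[i] for i in idx] for name, column in data.items()}


# Small demonstration with plain lists and stand-in filter objects.
data = {
    "species": ["setosa", "versicolor", "versicolor", "virginica"],
    "petal_length": [1.4, 4.7, 4.5, 6.0],
}
filters = [
    SimpleNamespace(column_name="species", group="versicolor"),
    SimpleNamespace(booleans=[True, True, False, True]),
]
result = filtered_data(data, filters)
```

With the example from the question, and assuming the view exposes its filters as a list as it does there, the call would look like `filtered_data(source.data, view1.filters)`. The result is computed on the Python side only and is independent of what BokehJS renders.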