Updating Figure Legend Using Data Source

Hello everyone. First of all, please go easy on me: I am new to Python and to this forum.

I am trying to create an engineering tool with a simple layout. Everything works except some formatting and the legend labels.

I understand that a ColumnDataSource is essential for interactivity between the Bokeh server and my code. If I don’t use a ColumnDataSource, updates made in the browser are not reflected back in the code. So I am creating all my figures from a ColumnDataSource. However, I have a problem with labels.

In my ColumnDataSource, I have n arrays. The first is x_vector, which is basically an array running from xmin to xmax. The rest are y_vectors, each the same size as x_vector, as it should be.

So, I first plot x_vector against y_vector1 as a line, then x_vector against y_vector2, and so on until all n y_vectors are plotted.

The question is: how can I label the y_vectors in an interactive way? Assume n=6, so I have 6 y_vectors and 1 x_vector, each with (say) 1000 elements. But my label vector is only 6 elements long, since I have only 6 y_vectors! So I cannot use the same source for both my x and y data and my labels.

I have tried to use lists as the values in the dictionary, but I couldn’t get it to work. I thought passing x and y as lists of vectors would work, but it didn’t, because y=‘b’[0] is not valid.

from bokeh.io import show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
import numpy as np

a = np.arange(0,100,1)
b1 = a*2
b2 = a*3
labels = ["b1","b2"]

sourcedict = {"a": [a], "b": [b1,b2], "c": labels}

source = ColumnDataSource(sourcedict)

fig1 = figure()

# fig1.line(x='a', y='b', legend_label = 'c', source=source)
#BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('a', 1), ('b', 2), ('c', 2)
fig1_alt = figure()

fig1_alt.line(x='a'[0], y='b'[0], legend_label = 'c'[0])

Hi @bde,

If I understand your question correctly, I think you may want a grouped legend. Take a look at this example in the User Guide, and see if that will meet your needs.

Dear @carolyn,
Thank you for your answer. I am not sure whether a grouped legend would solve my problem. To make it easier for you to investigate, I have written some code. When reading it, assume that the number of y vectors depends on user input in the Bokeh server. When the user changes y_number, the figure and the labels should be updated, but I couldn’t even get that far.

from bokeh.io import output_file, show
import numpy as np
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource

dictxy = {}
dictxy["x_vector"] = np.arange(0,10,1)

y_number = 4 #Will be interactively decided by user! 
for i in range(1,y_number+1):
    dictxy["y_" + str(i)] = dictxy["x_vector"] * (i+1)


label = ['Times' + str(n) for n in range(2,y_number+2)]

CDSxy = ColumnDataSource (dictxy)

p = figure()

for i in range(1,y_number+1):
    p.line(x='x_vector', y='y_'+str(i), source = CDSxy, legend_label = label[i-1])
    
show(p)

I was trying to prepare a good code example for you to see my problem, but I couldn’t even get my figure to update! I am not sure I am capable of going any further.

from bokeh.io import output_file, show, curdoc
import numpy as np
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, Slider
from bokeh.layouts import layout, column

dictxy = {}
dictxy["x_vector"] = np.arange(0,10,1)

y_number = 4 #Will be interactively decided by user! 
def create_model():
    for i in range(1,y_number+1):
        dictxy["y_" + str(i)] = dictxy["x_vector"] * (i+1)
    
    print (y_number)
    label = ['Times' + str(n) for n in range(2,y_number+2)]
    
    CDSxy = ColumnDataSource (dictxy)
    return CDSxy, label

CDSxy, label = create_model()

    
p = figure()

for i in range(1,y_number+1):
    p.line(x='x_vector', y='y_'+str(i), source = CDSxy, legend_label = label[i-1])
y_number_slider = Slider(title="Y_number", value=1, start=0, end=20, step=1)

c = column(y_number_slider, p)


def update_model(attrname, old, new):
    y_number = int(y_number_slider.value)
    globals().update(locals())
    create_model()


y_number_slider.on_change('value', update_model )

curdoc().title = "Deneme"
curdoc().add_root(c)

It sounds like what you want to do is

  1. dynamically add/remove columns from the CDS, and
  2. have those additions reflected in the glyphs and the legend.

Similar questions have been asked before on the Discourse, and there’s not a perfectly straightforward way to do this. Re adding/removing columns, I think the most relevant post is here; for the legend, a possible approach is given here.

The basic ideas from these posts are:

  1. Rather than adding/removing columns from your CDS, a better approach is to have the data and glyphs always exist, but toggle the visible attribute to be true/false based on the user’s input.
  2. There is not currently a great way to update the legend as you describe. The Legend will show all glyphs, whether they are visible or not. So, explicitly create a Legend with only the subset of glyphs you want to see on every update.

Here’s an example worked up from your code, to be run as a bokeh server. This all assumes that the maximum possible number of y-vectors your user can select is something manageable (you proposed 6, I arbitrarily chose 15 for my example). If the number of y-values is very high, like 1000, then this might not be a great solution at scale. But see what you think.

from bokeh.io import curdoc
import numpy as np
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, Slider, Legend, LegendItem
from bokeh.layouts import column


dictxy = {"x_vector": np.arange(0, 10, 1)}

y_max = 15  # I arbitrarily chose 15 as the max number of y arrays

# create arrays y_1 ... y_n, which are scaled versions of the x array
for i in range(1, y_max + 1):
    dictxy["y_" + str(i)] = dictxy["x_vector"] * (i + 1)

label = ['Times' + str(n) for n in range(2, y_max + 2)]
CDSxy = ColumnDataSource(dictxy)

# slider for user interactivity. Arbitrarily setting default value to 5.
y_slider = Slider(start=1, end=y_max, value=5, step=1, title="y value")


# This is the bokeh server callback that will be run every time the user changes the slider value.
def update_plot(attr, old, new):
    # clear out existing legend and start over; we will entirely redraw
    p.legend.items = []
    for i in range(1, y_max + 1):
        # if line number is less than new value from the slider, we should see it and it should be in the legend.
        # if not, set to invisible, and do not add to legend.
        if i <= new:
            lines[i-1].visible = True
            p.legend.items.append(LegendItem(label=label[i-1], renderers=[lines[i-1]]))
        else:
            lines[i-1].visible = False


y_slider.on_change("value", update_plot)

p = figure()

# Add all our line renderers. We'll keep them in a list so we can easily access them later.
lines = []
for i in range(1, y_max + 1):
    if i <= y_slider.value:
        visible = True
    else:
        visible = False
    lines.append(p.line(x='x_vector', y='y_' + str(i), source=CDSxy, visible=visible))

# Every line we just drew would go into the figure's Legend by default. We want only visible lines in the Legend.
# Start with an empty legend and add only what we want:
custom_legend = Legend()
for i in range(1, y_max + 1):
    if i <= y_slider.value:
        custom_legend.items.append(LegendItem(label=label[i-1], renderers=[lines[i-1]]))
p.add_layout(custom_legend)

col = column(y_slider, p)

doc = curdoc()
doc.add_root(col)

The behaviour is exactly what I wanted. Thank you very much for such a detailed answer! I am now working through every step you have provided so that I can understand it better.

One thing: in your code, you calculate the y values once at the start and store them. Creating the legend at the start is not a problem for me, and I can limit y_max in my project. But my y values also depend on user input (for example x * a, where a is given by the user). Is it a good approach to recalculate y in a callback function, as I did in my post? Or is there a simpler way?

Thank you very much, again and again.

I think you’d want to add a second callback for that other user input. Something like:

a_slider = Slider(start=2, end=5, value=2, step=1, title="starting multiplier")


def update_multiplier(attr, old, new):
    for i in range(1, y_max+1):
        # y_1 uses the slider value itself, so new == 2 reproduces the initial x * (i + 1) data
        CDSxy.data["y_" + str(i)] = CDSxy.data['x_vector'] * (new + i - 1)


a_slider.on_change("value", update_multiplier)

The problem I see with your function as written above is that it looks like you’re trying to replace the CDS with an entirely new CDS, which won’t trigger a change to your glyphs. The glyphs are listening for changes to the data in the existing CDS, so that’s what you’d want to update.

I’m sure others could explain this better than I can, but that’s the basic idea: replace the data in your existing structure, instead of replacing the whole structure, and your glyphs will update accordingly.
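To make the "replace the data, not the structure" idea concrete, here is a minimal sketch. The helper name build_data and the n_lines parameter are made up for illustration and are not part of Bokeh or of the code above. The sketch assumes the multiplier for y_1 is the slider value itself. It builds a complete new dict and assigns it to the existing source's .data in one step, rather than creating a second ColumnDataSource.

```python
import numpy as np
from bokeh.models import ColumnDataSource


def build_data(x, multiplier, n_lines):
    """Hypothetical helper: return a full column dict, x_vector plus y_1 ... y_n."""
    data = {"x_vector": x}
    for i in range(1, n_lines + 1):
        # y_1 uses `multiplier`, y_2 uses `multiplier + 1`, and so on
        data["y_" + str(i)] = x * (multiplier + i - 1)
    return data


x = np.arange(0, 10, 1)
source = ColumnDataSource(data=build_data(x, multiplier=2, n_lines=3))


def update_multiplier(attr, old, new):
    # Keep the same ColumnDataSource object and swap its contents in a single
    # assignment; glyphs created with source=source keep pointing at it.
    source.data = build_data(source.data["x_vector"], new, n_lines=3)


# In a server app this callback would be registered with
# a_slider.on_change("value", update_multiplier); here it is called directly.
update_multiplier("value", 2, 4)
```

Because every column is rebuilt together, all columns stay the same length, and the source is updated once per slider change instead of once per column.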

How can it be done with CustomJS?