Dynamically Updating Color Bar for Multiple Plots

aali · July 1, 2024, 5:28pm

I am working with Python version 3.11.7 and Bokeh version 3.3.4.

I am creating a data visualization web app with Bokeh. My web app allows users to load in data from a couple different databases and view their selected data on multiple Bokeh plots. The data is loaded in as a pandas data frame. Each plot has its own independent ColumnDataSource which is a copy of this data frame. This means that each plot has access to the same exact data, but is entirely independent from every other plot. The goal of this web application is to provide users with an intuitive and versatile workflow for data analysis.

Each plot has X and Y dropdown menus for selecting the data shown on each axis, plus a dropdown menu for selecting a variable to color the plot by. I want the color bar to change dynamically based on the selection made in the color dropdown menu. However, the best I can do right now is add a new color bar to the plot for each successive selection, which causes the color bars to stack next to each other. What I need is a way to remove the color bar from the plot entirely and add a new one each time the color selection changes, but I can’t seem to find a way to do that. Since I have multiple plots that will each have their own color bar, I want a generalizable callback function for the color bars, one that scales to multiple plots.

So far, I have tried writing an if statement within the color bar callback that checks whether a color bar is already present on the plot and removes it if so. I’ve also tried changing the color map of the existing color bar rather than redrawing the whole color bar, but that didn’t work either and resulted in the color bar not showing up at all. Also, since this web app has several plots with the same functionality, I typically write one generalizable callback function for a feature and then use Python’s functools.partial to attach the callback to each individual widget for each plot.

I’ve stripped down my web app to the simple minimum working example that I’ve included in this post. This code shows the problem that I’m explaining of the color bars stacking next to each other. There is one plot that has three dropdown menus that users can interact with. The update_color() function was my attempt at creating a callback function for the color dropdown menu. Does anyone have suggestions about how I can edit my code to achieve the desired functionality?

Generally speaking, I am pretty uninformed about how to dynamically change plot attributes like color bars or axes ranges based on user selections and inputs. I’d appreciate any insight into this!

Note that I typically run my code as a Bokeh server using the bokeh serve --show path/filename.py command.

Here is the minimum working example:

# Imports
from bokeh.io import curdoc
from bokeh.layouts import column, row, Spacer
from bokeh.models import CheckboxGroup, RadioButtonGroup, MultiSelect, PointDrawTool, DataTable, TableColumn, LinearColorMapper, LogColorMapper, ColorBar, Button, TextInput, ColumnDataSource, Select, Div, CustomJS, CDSView, BooleanFilter, IndexFilter, Tooltip, HoverTool, CustomJSTransform, CustomJSFilter
from bokeh.plotting import figure
from bokeh.models.layouts import Tabs, TabPanel
from bokeh.transform import transform, factor_cmap, linear_cmap
from bokeh.palettes import Paired
from functools import partial
import pandas as pd
import numpy as np
import scipy as sp
from datetime import datetime, timedelta
import random
  
# Initializing sample data
num_rows = 1000
num_cols = 10
start_date = datetime(2020,1,1)
end_date = datetime(2023,1,1)
date_range = [start_date + timedelta(days=random.randint(0, (end_date - start_date).days)) for _ in range(num_rows)]
columns = [f'Feature_{i}' for i in range(1,num_cols + 1)]
data = np.random.randn(num_rows, num_cols)
df = pd.DataFrame(data, index = date_range, columns = columns)
  
# Creating example plot
options = list(df.columns)
TOOLS = 'pan, wheel_zoom, reset, hover, poly_select, box_select, lasso_select, box_edit, save'
source = ColumnDataSource(df)
x_select = Select(title="X Variable", options=['Select'] + options, value='Select')
y_select = Select(title="Y Variable", options=['Select'] + options, value='Select')
color_select = Select(title = "Color By", options = ['Select'] + options, value = 'Select')
p = figure(title = 'Plot 1', width = 800, height = 700, x_axis_label = 'X', y_axis_label = 'Y', tools = TOOLS)
r = p.circle(x = 'X', y = 'Y', source = source, selection_color = 'red', nonselection_color = 'blue')
  
  
# Callback to update plot based on dropdown selections
cbxy = CustomJS(args = dict(r = r, y_select = y_select, x_select = x_select, xaxis = p.xaxis, yaxis = p.yaxis), code = """
                //clearing existing data from the data source
                r.data = {x:[], y:[]};
                
                //get the values of both selects
                const yf = y_select.value;
                const xf = x_select.value;
                
                //tell the glyph which fields the source should refer to
                r.glyph.y.field = yf;
                r.glyph.x.field = xf;
                
                //change axis labels accordingly
                yaxis[0].axis_label = yf;
                xaxis[0].axis_label = xf;
                
                //manually trigger change event to re-render
                r.glyph.change.emit()
                """)
                
# Callback to update color bar based on dropdown selection
def update_color(attr, old, new):
    x = x_select.value
    y = y_select.value
    color_by = color_select.value
  
    if color_by == 'None':
        p.scatter(x=x, y=y, source=source)
        # Remove the color bar if it exists
        for layout in p.layout:
            if isinstance(layout, ColorBar):
                p.remove_layout(layout)
    else:
        mapper = linear_cmap(field_name=color_by, palette=Paired[10], low=min(
            df[color_by]), high=max(df[color_by]))
        p.scatter(x=x, y=y, source=source, color=mapper)
        color_bar = ColorBar(
            color_mapper=mapper['transform'], width=8, location=(0, 0))
        # Following line of code is causing colorbars to stack
        p.add_layout(color_bar, 'right')
  
# Assigning callbacks to widgets
colorcb1 = partial(update_color, source, color_select, p)
x_select.js_on_change('value', cbxy)
y_select.js_on_change('value', cbxy)
color_select.on_change('value', update_color)
  
# Formatting
wb1 = row(column(x_select, y_select, color_select), p)
layout = column(wb1)
curdoc().add_root(layout)

nmasnadi · July 2, 2024, 12:41am

I think this is what you want if I understood the problem correctly:

from bokeh.layouts import column, row, Spacer
from bokeh.models import CheckboxGroup, RadioButtonGroup, MultiSelect, PointDrawTool, DataTable, TableColumn, LinearColorMapper, LogColorMapper, ColorBar, Button, TextInput, ColumnDataSource, Select, Div, CustomJS, CDSView, BooleanFilter, IndexFilter, Tooltip, HoverTool, CustomJSTransform, CustomJSFilter
from bokeh.plotting import figure, show  # show() is called at the end of the script
from bokeh.models.layouts import Tabs, TabPanel
from bokeh.transform import transform, factor_cmap, linear_cmap
from bokeh.palettes import Paired
from functools import partial
import pandas as pd
import numpy as np
import scipy as sp
from datetime import datetime, timedelta
import random
  
# Initializing sample data
num_rows = 1000
num_cols = 10
start_date = datetime(2020,1,1)
end_date = datetime(2023,1,1)
date_range = [start_date + timedelta(days=random.randint(0, (end_date - start_date).days)) for _ in range(num_rows)]
columns = [f'Feature_{i}' for i in range(1,num_cols + 1)]
data = np.random.randn(num_rows, num_cols)
df = pd.DataFrame(data, index = date_range, columns = columns)
  
# Creating example plot
options = list(df.columns)
TOOLS = 'pan, wheel_zoom, reset, hover, poly_select, box_select, lasso_select, box_edit, save'

df['X'] = df['Feature_1']
df['Y'] = df['Feature_2']
df['Color'] = df['Feature_3']
source = ColumnDataSource(df)
x_select = Select(title="X Variable", options=options, value='Feature_1')
y_select = Select(title="Y Variable", options=options, value='Feature_2')
color_select = Select(title = "Color By", options = options, value = 'Feature_3')
p = figure(title = 'Plot 1', width = 800, height = 700, x_axis_label = 'X', y_axis_label = 'Y', tools = TOOLS)
r = p.circle(
    x = 'X', y = 'Y', source = source,
    color=linear_cmap(
        'Color', 'TolYlOrBr9', df['Color'].min(), df['Color'].max()
    ),
    line_color=None,
    size=8,
)
color_bar = ColorBar(color_mapper=r.glyph.fill_color['transform'], label_standoff=12)
color_bar.location = 'bottom_right'
p.add_layout(color_bar, 'right')

js_code = """
const data = source.data;
const color_column = color_select.value
const color_values = data[color_column]
r.glyph.fill_color.field = color_column
r.glyph.fill_color.transform.high = Math.max(...color_values)
r.glyph.fill_color.transform.low =  Math.min(...color_values)
data['X'] = data[x_select.value]
data['Y'] = data[y_select.value]
source.change.emit();
"""

cjs = CustomJS(
    args={
        'source': source,
        'r': r,
        'x_select': x_select,
        'y_select': y_select,
        'color_select': color_select
    },
    code=js_code
)

# Assigning callbacks to widgets
x_select.js_on_change('value', cjs)
y_select.js_on_change('value', cjs)
color_select.js_on_change('value', cjs)
  
# Formatting
wb1 = row(column(x_select, y_select, color_select), p)
layout = column(wb1)
show(layout)

All the callback code is in JS, so it works as a standalone document, but you can implement the same thing in Python callbacks too if you prefer. Here is how it looks for me:

[Screen recording attachment: Screen Recording 2024-07-01 at 8.36.44 PM]
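
For anyone running this under bokeh serve, here is a minimal sketch of what the Python-callback version might look like. It is not taken from the code above: it assumes a single LinearColorMapper created up front and shared by the glyph and the ColorBar, the Viridis256 palette is an arbitrary choice, and the sample data is a stand-in. The idea is the same as in the JS version: update the existing mapper instead of adding a new ColorBar on every selection.

```python
from functools import partial

import numpy as np
import pandas as pd
from bokeh.io import curdoc
from bokeh.layouts import column, row
from bokeh.models import ColorBar, ColumnDataSource, LinearColorMapper, Select
from bokeh.palettes import Viridis256
from bokeh.plotting import figure
from bokeh.transform import transform

# Stand-in data (assumption); a real app would use its own DataFrame.
rng = np.random.default_rng(0)
columns = [f"Feature_{i}" for i in range(1, 5)]
df = pd.DataFrame(rng.standard_normal((200, 4)), columns=columns)
source = ColumnDataSource(df)

# One mapper and one ColorBar, created once and reused for every selection.
start = "Feature_3"
mapper = LinearColorMapper(palette=Viridis256,
                           low=float(df[start].min()),
                           high=float(df[start].max()))
p = figure(title="Plot 1", width=800, height=700)
r = p.scatter(x="Feature_1", y="Feature_2", source=source, size=8,
              line_color=None, fill_color=transform(start, mapper))
p.add_layout(ColorBar(color_mapper=mapper, label_standoff=12), "right")


def update_color(renderer, mapper, source, attr, old, new):
    """Rescale the shared mapper and point the glyph at the column `new`."""
    values = np.asarray(source.data[new], dtype=float)
    mapper.low = float(np.nanmin(values))
    mapper.high = float(np.nanmax(values))
    # Reassign the whole spec so the glyph property change is registered.
    renderer.glyph.fill_color = transform(new, mapper)


color_select = Select(title="Color By", options=columns, value=start)
# partial binds the per-plot objects, leaving the (attr, old, new) signature.
color_select.on_change("value", partial(update_color, r, mapper, source))

curdoc().add_root(row(column(color_select), p))
```

Because the renderer, mapper, and source are passed in through partial, the same update_color function can be attached to the color dropdown of each plot.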

aali · July 2, 2024, 5:08pm

Thanks for taking the time to help out!

I have a question regarding your code. You wrote the line:

df['Color'] = df['Feature_3']

I can see that you created a new column within the data frame called ‘Color’, but why did you create it as a copy of the ‘Feature_3’ column? Later on in the code, while initializing the renderer, you wrote this:

r = p.circle(
    x = 'X', y = 'Y', source = source,
    color=linear_cmap(
        'Color', 'TolYlOrBr9', df['Color'].min(), df['Color'].max()
    ),
    line_color=None,
    size=8,
)

Since the data frame’s ‘Color’ column is just a copy of the ‘Feature_3’ column, I don’t understand why it makes sense to bound the color bar by the minimum and maximum values of the ‘Color’ column as you did here. These minimum and maximum values are just arbitrary values, aren’t they? What if I wanted to color the plot by a variable with a wider or narrower range than the min and max values hard coded here?

I’m generally just confused about why you created the ‘Color’ column in the data frame, and how it is useful.

nmasnadi · July 2, 2024, 6:53pm

So basically the main difference between my code and your original code is that yours starts with an empty plot, while mine starts with a plot of Feature_1 vs. Feature_2 with Feature_3 as the color. The user can then change any of them using the select widgets and the plot will update (including the min/max for the color bar). This is all done in the JavaScript callback. So it is true that I’m using Feature_3 as the color and it looks hard coded there, but the column used for color and the min/max for the color bar are actually updated dynamically when the options change in the select widgets. These three lines in the JS code take care of updating the column used for the color and the min/max of the color bar:

r.glyph.fill_color.field = color_column
r.glyph.fill_color.transform.high = Math.max(...color_values)
r.glyph.fill_color.transform.low =  Math.min(...color_values)

We can also get rid of the Color column and just use Feature_3 (or any other column, as long as we use the same one for the value in color_select) like this:

r = p.circle(
    x = 'X', y = 'Y', source = source,
    color=linear_cmap(
        'Feature_3', 'TolYlOrBr9', df['Feature_3'].min(), df['Feature_3'].max()
    ),
    line_color=None,
    size=8,
)

I’m not sure if this is the best solution, but it is what I came up with a while ago when I was trying to solve a similar problem, and it works!
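
Since the original question asked for something that scales to several plots, here is a rough sketch of how the same pattern could be wrapped in a factory function. The names make_colored_plot, ColoredPlot, and RECOLOR_JS are hypothetical, the Viridis256 palette and the sample data are assumptions, and the JS body follows the same approach as the callback above (update the mapper’s low/high and the glyph’s color field, then emit a change on the source). It uses a plain loop for min/max rather than spreading the whole array into Math.max.

```python
from typing import NamedTuple

import numpy as np
import pandas as pd
from bokeh.layouts import column, row
from bokeh.models import (ColorBar, ColumnDataSource, CustomJS,
                          LinearColorMapper, Row, Select)
from bokeh.palettes import Viridis256
from bokeh.plotting import figure
from bokeh.transform import transform

# Shared JS body; each plot gets its own CustomJS with its own args.
RECOLOR_JS = """
// Scan once for the finite min/max of the newly selected column.
const col = cb_obj.value
let low = Infinity, high = -Infinity
for (const v of source.data[col]) {
    if (!Number.isFinite(v)) continue
    if (v < low) low = v
    if (v > high) high = v
}
mapper.low = low
mapper.high = high
renderer.glyph.fill_color.field = col
source.change.emit()
"""


class ColoredPlot(NamedTuple):
    layout: Row
    plot: figure
    mapper: LinearColorMapper


def make_colored_plot(df, title, x, y, color):
    """Build one plot with its own source, mapper, ColorBar, and dropdown."""
    source = ColumnDataSource(df.copy())  # independent copy per plot
    mapper = LinearColorMapper(palette=Viridis256,
                               low=float(df[color].min()),
                               high=float(df[color].max()))
    p = figure(title=title, width=500, height=450)
    renderer = p.scatter(x=x, y=y, source=source, size=8, line_color=None,
                         fill_color=transform(color, mapper))
    p.add_layout(ColorBar(color_mapper=mapper, label_standoff=12), "right")

    select = Select(title="Color By", options=list(df.columns), value=color)
    select.js_on_change("value", CustomJS(
        args={"source": source, "renderer": renderer, "mapper": mapper},
        code=RECOLOR_JS))
    return ColoredPlot(row(column(select), p), p, mapper)


# Stand-in data (assumption) and two independent plots.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 4)),
                  columns=[f"Feature_{i}" for i in range(1, 5)])
first = make_colored_plot(df, "Plot 1", "Feature_1", "Feature_2", "Feature_3")
second = make_colored_plot(df, "Plot 2", "Feature_3", "Feature_4", "Feature_2")
layout = column(first.layout, second.layout)
```

Passing layout to show() or curdoc().add_root() would then display both plots, each with exactly one color bar that rescales when its own dropdown changes.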
