Change y_range when hiding and showing lines

duarbrun · July 21, 2023, 1:20pm

Hello everyone,

I’m trying to make a chart whose number of lines is defined by the number of different cities in a pandas DataFrame. I want to hide and show some of the lines (maybe with a MultiSelect control) and have the y_range update to fit the data of the visible lines. I’m able to control the hide/show part, but not the adjustment of y_range.

Here’s a DataFrame with the same structure as the one I’m dealing with:

#%% Creating dataset
import pandas as pd

data = {
        'city':['City-1','City-2','City-3','City-4','City-1','City-2','City-3',
                'City-4','City-1','City-2','City-3','City-4','City-1','City-2',
                'City-3','City-4','City-1','City-2','City-3','City-4','City-1',
                'City-2','City-3','City-4','City-1','City-2','City-3','City-4'],
        'value':['440770.43','48259.34','14112.54','59208.12','405397.05',
                 '50299.98','19374.11','73865.03','401559.71','49777.54',
                 '19906.50','60450.23','414458.60','50161.29','16739.60',
                 '61169.98','411423.85','50990.14','16025.36','63231.71',
                 '401162.64','51719.12','20457.94','62856.73','502449.53',
                 '66137.81','22318.40','87541.79'],
        'date':['2022-06-01','2022-06-01','2022-06-01','2022-06-01','2022-07-01',
                '2022-07-01','2022-07-01','2022-07-01','2022-08-01','2022-08-01',
                '2022-08-01','2022-08-01','2022-09-01','2022-09-01','2022-09-01',
                '2022-09-01','2022-10-01','2022-10-01','2022-10-01','2022-10-01',
                '2022-11-01','2022-11-01','2022-11-01','2022-11-01','2022-12-01',
                '2022-12-01','2022-12-01','2022-12-01']
        }
df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'])
df['value'] = pd.to_numeric(df['value'])

I tried a few things:

My first attempt used a MultiSelect control with a CustomJS callback. Hiding and showing the lines through the control works. The problem is that plot.y_range doesn’t update to fit the visible data. At the end of the CustomJS callback, plot.y_range.start and plot.y_range.end are updated, but when the callback runs again, the y_range has the values corresponding to the entire source.

#%% Plotting using MultiSelect and CustomJS
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, MultiSelect, NumeralTickFormatter
from bokeh.plotting import figure, show

multiselect = MultiSelect(title="Cities:", value=list(df['city'].unique()),
                          options=list(df['city'].unique()))

plot = figure(width=400, height=250, title='Total monthly value - by city',
              x_axis_type='datetime')
plot.yaxis[0].formatter = NumeralTickFormatter(format="R$0,00")

colors = ['red', 'blue', 'green', 'purple', 'orange']

lines = []
for i, city in enumerate(df['city'].unique()):
    df_city = df[df['city'] == city]
    color = colors[i % len(colors)]  # Loop through the colors list cyclically
    line = plot.line(x='date', y='value', color=color, source=df_city, legend_label=city, line_width=2)
    lines.append(line)
    
plot.legend.location = 'top_left'
plot.legend.title = 'Cities'

source = ColumnDataSource(df)

# Update the plot's data source when the MultiSelect is interacted with
multiselect.js_on_change('value', CustomJS(args=dict(plot=plot, lines=lines, source=source, multiselect=multiselect), code="""
    const selected_cities = multiselect.value;
    const data = source.data;

    // Check if the original_data attribute exists; if not, set it to the current data
    if (!this.original_data) {
        this.original_data = {
            date: [...data['date']],
            value: [...data['value']],
            city: [...data['city']]
        };
    }

    // Filter the original data in the browser using the selected cities
    const filtered_data = { date: [], value: [], city: [] };
    for (let i = 0; i < this.original_data['city'].length; i++) {
        if (selected_cities.includes(this.original_data['city'][i])) {
            filtered_data.date.push(this.original_data['date'][i]);
            filtered_data.value.push(this.original_data['value'][i]);
            filtered_data.city.push(this.original_data['city'][i]);
        }
    }

    if (filtered_data.value.length > 0) {
        // Calculate the new y_range based on the filtered data for visible cities
        const visible_data = filtered_data.value.filter((val, index) => selected_cities.includes(filtered_data.city[index]));
        const min_value = Math.min(...visible_data) - 5;
        const max_value = Math.max(...visible_data) + 5;
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;

        // Update the plot data source with the filtered data
        source.data = filtered_data;
    } else {
        // Reset y_range to its original values if there is no visible data
        const min_value = Math.min(...this.original_data.value);
        const max_value = Math.max(...this.original_data.value);
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;
    }

    for (let i = 0; i < lines.length; i++) {
        const city = lines[i].data_source.data['city'][0];
        const visible = selected_cities.includes(city);
        lines[i].visible = visible;
    }

    plot.change.emit();
"""))

show(column(multiselect, plot))

A curious thing happens when I add only one line to the plot, as in the code below. Instead of the for loop, I added a single line to the lines list, while everything else remained the same (including the CustomJS callback). When I select other cities (while City-4 is still selected), plot.y_range is updated, but I cannot make it work once the other lines are added to the plot.

#%% Plotting only one line
multiselect = MultiSelect(title="Cities:", value=list(df['city'].unique()),
                          options=list(df['city'].unique()))

plot = figure(width=400, height=250, title='Total monthly value - by city',
              x_axis_type='datetime')
plot.yaxis[0].formatter = NumeralTickFormatter(format="R$0,00")

colors = ['red', 'blue', 'green', 'purple', 'orange']

lines = []
lines.append(plot.line(x='date', y='value', color='blue', source=df[df['city'] == 'City-4'], legend_label='City-4', line_width=2))

plot.legend.location = 'top_left'
plot.legend.title = 'Cities'

source = ColumnDataSource(df)

# Update the plot's data source when the MultiSelect is interacted with
multiselect.js_on_change('value', CustomJS(args=dict(plot=plot, lines=lines, source=source, multiselect=multiselect), code="""
    const selected_cities = multiselect.value;
    const data = source.data;

    // Check if the original_data attribute exists; if not, set it to the current data
    if (!this.original_data) {
        this.original_data = {
            date: [...data['date']],
            value: [...data['value']],
            city: [...data['city']]
        };
    }

    // Filter the original data in the browser using the selected cities
    const filtered_data = { date: [], value: [], city: [] };
    for (let i = 0; i < this.original_data['city'].length; i++) {
        if (selected_cities.includes(this.original_data['city'][i])) {
            filtered_data.date.push(this.original_data['date'][i]);
            filtered_data.value.push(this.original_data['value'][i]);
            filtered_data.city.push(this.original_data['city'][i]);
        }
    }

    if (filtered_data.value.length > 0) {
        // Calculate the new y_range based on the filtered data for visible cities
        const visible_data = filtered_data.value.filter((val, index) => selected_cities.includes(filtered_data.city[index]));
        const min_value = Math.min(...visible_data) - 5;
        const max_value = Math.max(...visible_data) + 5;
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;

        // Update the plot data source with the filtered data
        source.data = filtered_data;
    } else {
        // Reset y_range to its original values if there is no visible data
        const min_value = Math.min(...this.original_data.value);
        const max_value = Math.max(...this.original_data.value);
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;
    }

    for (let i = 0; i < lines.length; i++) {
        const city = lines[i].data_source.data['city'][0];
        const visible = selected_cities.includes(city);
        lines[i].visible = visible;
    }

    plot.change.emit();
"""))

show(column(multiselect, plot))

My third attempt used figure.legend.click_policy (as in the code below) to hide the lines. Although hiding and showing works just fine, I cannot think of a way to make it adjust the y_range of the plot.

from bokeh.palettes import Spectral10

nplot = figure(width=400, height=250, title='Total monthly value - by city',
               x_axis_type="datetime")
nplot.yaxis[0].formatter = NumeralTickFormatter(format="$0.00", language="pt-br")

lines = []
# Column names match the sample DataFrame above (city, date, value)
for cidade, color in zip(df['city'].unique(), Spectral10):
    df_city = df[df['city'] == cidade]
    line = nplot.line(x='date', y='value', color=color, source=df_city,
                      legend_label=cidade, line_width=2)
    lines.append(line)

nplot.legend.location = "top_left"
nplot.legend.click_policy="hide"
show(nplot)

I’m running the code with Spyder (Anaconda3) on my local machine. From my research, I’ve come to the conclusion that there isn’t a standard way to accomplish what I need. Does anyone know of a workaround?

Thanks in advance.

P.S.: English is not my first language, so I’m sorry if anything is not completely understandable.

Bryan · July 21, 2023, 3:38pm

@duarbrun I think there may be much simpler solutions to your need, so let’s look at those first.

Are you familiar with the interactive legend feature? That would allow users to hide glyphs by clicking on a legend, and you would not need any CustomJS or MultiSelect at all.

Alternatively, if you do still want to use a MultiSelect then the common approach is to have a CustomJS that toggles the .visible property of the relevant glyph renderers. That is much simpler than all the complicated CDS data re-arranging. There is an example in the repo:

https://github.com/bokeh/bokeh/blob/branch-3.3/examples/plotting/line_on_off.py

With either of these two approaches, the key is to use the standard default data ranges rather than setting the range extents yourself. You’ll also need to set range.only_visible = True so that non-visible glyphs do not contribute to the auto-ranging.
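
The MultiSelect variant described above could look roughly like the sketch below. It assumes the city, date, and value columns from the original question. The helper name build_layout is made up for illustration, and the callback has not been checked against every Bokeh version.

```python
import pandas as pd
from bokeh.layouts import column
from bokeh.models import CustomJS, DataRange1d, MultiSelect
from bokeh.plotting import figure


def build_layout(df):
    """Return a MultiSelect stacked on a line plot whose y-axis follows the visible lines."""
    cities = df["city"].unique().tolist()

    # only_visible=True asks the auto-ranging to ignore renderers with visible=False.
    plot = figure(width=400, height=250, x_axis_type="datetime",
                  y_range=DataRange1d(only_visible=True))

    colors = ["red", "blue", "green", "purple", "orange"]
    lines = []
    for i, city in enumerate(cities):
        # One renderer per city, each backed by its own slice of the data.
        lines.append(plot.line(x="date", y="value", source=df[df["city"] == city],
                               color=colors[i % len(colors)], line_width=2,
                               legend_label=city))

    multiselect = MultiSelect(title="Cities:", value=cities, options=cities)
    # The callback only toggles .visible; no range arithmetic and no data copying.
    multiselect.js_on_change("value", CustomJS(args=dict(lines=lines, cities=cities), code="""
        const selected = cb_obj.value;
        for (let i = 0; i < lines.length; i++) {
            lines[i].visible = selected.includes(cities[i]);
        }
    """))
    return column(multiselect, plot)
```

With the DataFrame from the question, show(build_layout(df)) (show comes from bokeh.plotting) should display the widget and the plot. Deselecting City-1 should then let the y-axis rescale to the remaining cities.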

jcarson · July 21, 2023, 3:45pm

Something happens under the covers in Bokeh: when you don’t set an argument while creating the plot, it picks a default, and that default is not easy to change later.

You may need to change your call to figure so that y_range is set to an initial value. Once that is done, the JavaScript callback should work.

Here is a code-snippet from my own code that successfully changes the y-range of a plot from a javascript callback:

    plot.y_range.start = range[key][0];
    plot.y_range.end = range[key][1];
    plot.y_range.reset_start = range[key][0];
    plot.y_range.reset_end = range[key][1];
    plot.y_range.change.emit()

In my case, I have a dictionary of ranges that I passed in to the callback as an argument rather than calculating the min/max on the fly, and the code sets both the currently displayed y-range and what the y-range should be if the user presses the reset button tool.

The only difference between what’s working in my code (besides setting y_range in the call to figure) and your javascript callback example is the plot.y_range.change.emit() call, so you might try that if it’s still not working.
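
A dictionary of ranges like the one mentioned above could be prepared on the Python side before it is handed to the callback. The sketch below assumes the city and value columns from the original question. The names ranges_by_city, combined_range, and pad_fraction are hypothetical, not part of Bokeh or of the snippet quoted above.

```python
import pandas as pd


def ranges_by_city(df, pad_fraction=0.05):
    """Map each city to a padded [start, end] pair covering its values."""
    ranges = {}
    for city, values in df.groupby("city")["value"]:
        low, high = float(values.min()), float(values.max())
        pad = (high - low) * pad_fraction  # breathing room above and below
        ranges[city] = [low - pad, high + pad]
    return ranges


def combined_range(ranges, selected):
    """Return the [start, end] pair spanning every selected city, or None."""
    picked = [ranges[city] for city in selected if city in ranges]
    if not picked:
        return None
    return [min(r[0] for r in picked), max(r[1] for r in picked)]
```

The per-city dictionary could then be passed to the callback through the args of CustomJS, in the same way the lines list is passed in the original question. combined_range shows how several selected cities might be merged into one pair of limits.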

duarbrun · July 21, 2023, 5:13pm

Thanks @jcarson ! I tried setting the y_range when calling figure, and it worked. I also added the plot.y_range.change.emit(), just in case.

For the sake of completeness, the updated code is below:

multiselect = MultiSelect(title="Cities:", value=list(df['city'].unique()),
                          options=list(df['city'].unique()))

# specifying an initial range
range_y = [df['value'].min(), df['value'].max()]

plot = figure(width=400, height=250, title='Total monthly value - by city',
              x_axis_type='datetime', y_range=range_y)
plot.yaxis[0].formatter = NumeralTickFormatter(format="R$0,00")

colors = ['red', 'blue', 'green', 'purple', 'orange']

lines = []
for i, city in enumerate(df['city'].unique()):
    df_city = df[df['city'] == city]
    color = colors[i % len(colors)]  # Loop through the colors list cyclically
    line = plot.line(x='date', y='value', color=color, source=df_city, legend_label=city, line_width=2)
    lines.append(line)
    
#lines.append(plot.line(x='data', y='fatresidencial', color='blue', source=df[df['cidade'] == 'Renascença'], legend_label='Renascença', line_width=2))

plot.legend.location = 'top_left'
plot.legend.title = 'Cities'

source = ColumnDataSource(df)

# Update the plot's data source when the MultiSelect is interacted with
multiselect.js_on_change('value', CustomJS(args=dict(plot=plot, lines=lines, source=source, multiselect=multiselect), code="""
    const selected_cities = multiselect.value;
    const data = source.data;

    // Check if the original_data attribute exists; if not, set it to the current data
    if (!this.original_data) {
        this.original_data = {
            date: [...data['date']],
            value: [...data['value']],
            city: [...data['city']]
        };
    }

    // Filter the original data in the browser using the selected cities
    const filtered_data = { date: [], value: [], city: [] };
    for (let i = 0; i < this.original_data['city'].length; i++) {
        if (selected_cities.includes(this.original_data['city'][i])) {
            filtered_data.date.push(this.original_data['date'][i]);
            filtered_data.value.push(this.original_data['value'][i]);
            filtered_data.city.push(this.original_data['city'][i]);
        }
    }

    if (filtered_data.value.length > 0) {
        // Calculate the new y_range based on the filtered data for visible cities
        const visible_data = filtered_data.value.filter((val, index) => selected_cities.includes(filtered_data.city[index]));
        const min_value = Math.min(...visible_data) - 5;
        const max_value = Math.max(...visible_data) + 5;
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;

        // Update the plot data source with the filtered data
        source.data = filtered_data;
    } else {
        // Reset y_range to its original values if there is no visible data
        const min_value = Math.min(...this.original_data.value);
        const max_value = Math.max(...this.original_data.value);
        plot.y_range.start = min_value;
        plot.y_range.end = max_value;
    }

    for (let i = 0; i < lines.length; i++) {
        const city = lines[i].data_source.data['city'][0];
        const visible = selected_cities.includes(city);
        lines[i].visible = visible;
    }
    //added to save changes
    plot.y_range.change.emit();
"""))

show(column(multiselect, plot))

I implemented both your solution and the one @Bryan provided, just to learn a little bit more.

Thank you both!

duarbrun · July 21, 2023, 5:16pm

Hello @Bryan !

I’m quite new to Bokeh, since I started using it two or three days ago. I had read a little about the interactive legend feature, but didn’t understand it that well.

I tried using it to show/hide the lines, but wasn’t able to control the y_range with it.

After seeing your answer I tried using range.only_visible = True with the interactive legend and it worked! The complete code can be seen below:

#%% Plotting with legend.click_policy
from bokeh.models import DataRange1d, NumeralTickFormatter
from bokeh.palettes import Spectral10
from bokeh.plotting import figure, show

nplot = figure(width=400, height=250, title='Total monthly value - by city', 
               x_axis_type="datetime", y_range=DataRange1d(only_visible=True))
nplot.yaxis[0].formatter = NumeralTickFormatter(format="$0.00", language="pt-br")

lines = []
for cidade, color in zip(df['city'].unique(), Spectral10):
    df_city = df[df['city'] == cidade]
    line = nplot.line(x='date', y='value', color=color, source=df_city, 
                      legend_label=cidade, line_width=2)
    lines.append(line)

nplot.legend.location = "top_left"
nplot.legend.click_policy="hide"
show(nplot)

I implemented both your solution and the one @jcarson provided, just to learn a little bit more.

Thank you both!

system · October 19, 2023, 5:17pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.