Interactive Histograms not updating with sliders

I have tried working with interactive graphs and everything works really well. However, the bokeh module doesn't have interactive histograms, so I used a function available online to make histograms. Now I am trying to update the histogram plots with the sliders, and it doesn't seem to work.
Below is the code. Could someone please help me make it work?

 import pandas as pd
 from pandas import DataFrame
 import numpy as np
 from bokeh.models import HoverTool , CategoricalColorMapper
 from bokeh.themes import built_in_themes
 from bokeh.io import curdoc
 from bokeh.models import Slider,DataTable, TableColumn
 from bokeh.plotting import figure, output_file, show, ColumnDataSource
 from bokeh.layouts import row, column, gridplot
 from bokeh.palettes import Category10_5, Category20_16
 from bokeh.models.widgets import Tabs, Panel
 from bokeh.core.properties import Float, Instance, Tuple, Bool, Enum
 from bokeh.models import InputWidget
 from bokeh.models.callbacks import Callback
 from bokeh.models.widgets import RangeSlider
 df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD')) 
 source = ColumnDataSource(df)
 A_Slider= RangeSlider(start=min(df['A']), end=max(df['A']), value=(min(df['A']),max(df['A'])), step=1, title='YY in mm')
 B_Slider = RangeSlider(start=min(df['B']), end=max(df['B']), value=(min(df['B']),max(df['B'])), step=1, title='ZZ')
 dff=df
 def callback(attr,new,old):
     A_s = A_Slider.value[0]
     A_e = A_Slider.value[1]
     B_s= B_Slider.value[0]
     B_e= B_Slider.value[1]
     dff= pd.DataFrame(df[(df.A >=A_s) & (df.A <= A_e) & (df.B >= B_s) & (df.B <= B_e)])
     source.data = ColumnDataSource.from_df(dff)
     #function ends here   
 A_Slider.on_change("value",callback)
 B_Slider.on_change("value",callback)
 # Histogram
 def interactive_histogram(dff,col,n_bins,bin_range,title,x_axis_label,x_tooltip):
     arr_hist, edges = np.histogram(dff[col],bins=n_bins,range=bin_range)
     # Column data source
     arr_df = pd.DataFrame({'count': arr_hist, 'left': edges[:-1], 'right': edges[1:]})
     arr_df['f_count'] = ['%d' % count for count in arr_df['count']]
     arr_df['f_interval'] = ['%d to %d ' % (left, right) for left, right in zip(arr_df['left'], arr_df['right'])]
     source = ColumnDataSource(arr_df)
     # Set up the figure same as before
     toollist = ['lasso_select', 'tap', 'reset', 'save','crosshair','wheel_zoom','pan','hover','box_select']
     p = figure(plot_width = 500, 
                plot_height = 500,
                title = title,
                x_axis_label = x_axis_label, 
                y_axis_label = 'Count',tools=toollist)
 
     # Add a quad glyph with source this time
     p.quad(bottom=0, 
            top='count', 
            left='left', 
            right='right', 
            source=source,
            fill_color='red',
            hover_fill_alpha=0.7,
            hover_fill_color='blue',
            line_color='black')
 
     # Add style to the plot
     p.title.align = 'center'
     p.title.text_font_size = '18pt'
     p.xaxis.axis_label_text_font_size = '12pt'
     p.xaxis.major_label_text_font_size = '12pt'
     p.yaxis.axis_label_text_font_size = '12pt'
     p.yaxis.major_label_text_font_size = '12pt'
 
     # Add a hover tool referring to the formatted columns
     hover = HoverTool(tooltips = [(x_tooltip, '@f_interval'),
                                   ('Count', '@f_count')])
 
     # Add the hover tool to the graph
     p.add_tools(hover)
     
     return p
 
 
 #
 binsize =10
 A_hist = interactive_histogram(df,'A',df['A'].nunique(),[min(df['A']),max(df['A'])],'A Histogram','A','A')
 B_hist = interactive_histogram(df,'B',df['B'].nunique(),[min(df['B']),max(df['B'])],'B','B ','B')
 #
 Graphs1 = row([A_hist,B_hist])
 Controls1= column([A_Slider,B_Slider])
 grid = gridplot([[Graphs1],
                 [Controls1]])
  
 curdoc().add_root(grid)
 show(grid)
 curdoc().title = "questiontrialsample"

Currently, the histograms are plotted and the sliders appear, but when I move the sliders, the plots do not update.
I tried passing an updated data frame to the histogram function, and it doesn't seem to work.

@VITTAL_S_S I have a few requests:

  • This is a community forum. I can’t be the only person to answer questions. We have to develop a culture that encourages group participation, so I ask that questions not start by tagging/requesting attention from any particular individual

  • Please format the code correctly. As you can see, the code above is all prefixed with email quote markers (>), so it cannot be cut and pasted into an editor, is not syntax highlighted, etc.

I hope I can write questions that you don't have to correct any more.
Question updated.

@VITTAL_S_S Do you know and understand how to look at the Bokeh server console output? There is often very useful information there. For instance, with this code, moving the first slider reports this error:

2019-07-23 12:16:14,899 error handling message Message ‘PATCH-DOC’ (revision 1) content: {‘events’: [{‘kind’: ‘ModelChanged’, ‘model’: {‘type’: ‘RangeSlider’, ‘id’: ‘1004’}, ‘attr’: ‘value’, ‘new’: [41.00000000000001, 99]}], ‘references’: }: NameError(“name ‘B_1_Slider’ is not defined”)

That’s telling you exactly what the problem in your Python code is: NameError("name 'B_1_Slider' is not defined"). Sure enough, your callback refers to B_1_Slider, which does not exist. There is a B_Slider though, so if I speculatively change your code to that, the error becomes:

AttributeError("‘DataFrame’ object has no attribute ‘B_1’")

Then if I make a (presumed) fix to that, then the next one is NameError("name 'B_s' is not defined")

None of these are really Bokeh questions at all. This is just generic Python debugging. Anyone writing Python code needs to learn how to interpret exception messages, track down and fix name errors, etc. I think you might be better served by seeking out online instruction or tutorials in basic Python programming that can help you develop these skills. This site really only caters to issues that are specific to Bokeh itself.

I gave your code a slight touch-up, and I believe it’s doing what you wanted now. There is most probably a better way of implementing what you want, but at least it kind of works. You would need to run your app with the Bokeh server for it to work.

import pandas as pd
import numpy as np
from bokeh.models import HoverTool
from bokeh.io import curdoc
from bokeh.plotting import figure, ColumnDataSource
from bokeh.layouts import row, column, gridplot
from bokeh.models.widgets import RangeSlider

class hist_data:
    def __init__(self, df, col, n_bins, bin_range):
        self.A_lwr = min(df['A'])
        self.A_upr = max(df['A'])
        self.B_lwr = min(df['B'])
        self.B_upr = max(df['B'])
        self.col = col
        self.n_bins = n_bins
        self.bin_range = bin_range
        self.original_df = df
        self.source = ColumnDataSource(self.create_hist_data(df))
    
    def filt_df(self):
        filt = (pd.DataFrame(self.original_df[(self.original_df.A >=self.A_lwr) & (self.original_df.A <= self.A_upr) & (self.original_df.B >= self.B_lwr) & (self.original_df.B <= self.B_upr)]))        
        print(f'{self.A_lwr} {self.A_upr} {self.B_lwr} {self.B_upr}')
        filt.shape
        return ColumnDataSource(self.create_hist_data(filt))

    def create_hist_data(self,df):
        arr_hist, edges = np.histogram(df[self.col],bins=self.n_bins, range=self.bin_range)
        arr_df = pd.DataFrame({'count': arr_hist, 'left': edges[:-1], 'right': edges[1:]})
        arr_df['f_count'] = ['%d' % count for count in arr_df['count']]
        arr_df['f_interval'] = ['%d to %d ' % (left, right) for left, right in zip(arr_df['left'], arr_df['right'])]        
        return (arr_df)
        
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD')) 
hist_data_A = hist_data(df,'A',df['A'].nunique(),[min(df['A']),max(df['A'])])
hist_data_B = hist_data(df,'B',df['B'].nunique(),[min(df['B']),max(df['B'])])

A_Slider= RangeSlider(start=min(df['A']), end=max(df['A']), value=(min(df['A']),max(df['A'])), step=1, title='YY in mm')
B_Slider = RangeSlider(start=min(df['B']), end=max(df['B']), value=(min(df['B']),max(df['B'])), step=1, title='ZZ')


def callback_A(attr, old, new):  # Bokeh passes (attr, old, new), in that order
    hist_data_A.A_lwr = new[0]
    hist_data_A.A_upr = new[1]
    hist_data_A.source = hist_data_A.filt_df()
    Graphs1.children[0] = plot_data_A()

def callback_B(attr, old, new):  # Bokeh passes (attr, old, new), in that order
    hist_data_B.B_lwr = new[0]
    hist_data_B.B_upr = new[1]
    hist_data_B.source = hist_data_B.filt_df()
    Graphs1.children[1] = plot_data_B()

A_Slider.on_change("value",callback_A)
B_Slider.on_change("value",callback_B)


# Histogram
def interactive_histogram( hist_data, title,x_axis_label,x_tooltip):
     source = hist_data
     # Set up the figure same as before
     toollist = ['lasso_select', 'tap', 'reset', 'save','crosshair','wheel_zoom','pan','hover','box_select']
     p = figure(plot_width = 500, 
                plot_height = 500,
                title = title,
                x_axis_label = x_axis_label, 
                y_axis_label = 'Count',tools=toollist)
     
     # Add a quad glyph with source this time
     p.quad(bottom=0, 
            top='count', 
            left='left', 
            right='right', 
            source=source,
            fill_color='red',
            hover_fill_alpha=0.7,
            hover_fill_color='blue',
            line_color='black')
     
     # Add style to the plot
     p.title.align = 'center'
     p.title.text_font_size = '18pt'
     p.xaxis.axis_label_text_font_size = '12pt'
     p.xaxis.major_label_text_font_size = '12pt'
     p.yaxis.axis_label_text_font_size = '12pt'
     p.yaxis.major_label_text_font_size = '12pt'
     
     # Add a hover tool referring to the formatted columns
     hover = HoverTool(tooltips = [(x_tooltip, '@f_interval'),
                                   ('Count', '@f_count')])
     
     # Add the hover tool to the graph
     p.add_tools(hover)
     return p
 
binsize = 10
def plot_data_A():
    A_hist = interactive_histogram(hist_data_A.source, 'A Histogram','A','A')
    return A_hist

def plot_data_B():
    B_hist = interactive_histogram(hist_data_B.source, 'B Histogram','B ','B')
    return B_hist
     #
Graphs1 = row([plot_data_A(), plot_data_B()])
Controls1= column([A_Slider,B_Slider])
grid = gridplot([[Graphs1],
                 [Controls1]])
    
curdoc().add_root(grid)
curdoc().title = "questiontrialsample"

@jnava thanks a lot for your code. I tried it and understood how you are using functions to update individual graphs from the sliders.
However, this serves only half of the purpose.
I have 2 sliders, and both of the sliders influence different graphs.
Suppose you change the A slider: both data source A and data source B must change.
That is, they are interdependent data sources. What modifications do you suggest so that I can implement this in the code you have presented?
Basically, I have 15 to 20 such graphs and 20 to 25 such data columns, which are interdependent.
I am scaling the program you gave here to the actual data set, and it seems that each graph is not linked by a callback or update to all of the sliders.

@Bryan
I am sorry. I had fixed this error in my code, but I didn't update the code in this question when I fixed it.

I believe you have all the elements to implement those requirements. In the example above if plot A and plot B are interdependent on the slider choices, then you can simply modify your callbacks and/or plotting functions. For example:

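def callback_A(attr, old, new):  # Bokeh passes (attr, old, new), in that order
    # The A range filters every histogram, so set it on both objects
    for hd in (hist_data_A, hist_data_B):
        hd.A_lwr = new[0]
        hd.A_upr = new[1]
    hist_data_A.source = hist_data_A.filt_df()
    hist_data_B.source = hist_data_B.filt_df()
    Graphs1.children[0] = plot_data_A()
    Graphs1.children[1] = plot_data_B()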

Just a side note: I recently learnt that it’s always better to update only the data property of the ColumnDataSource instead of creating a new object. Refer to this post.
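
For the interdependent case, one way to look at it is that every histogram is drawn from the same set of rows: those that pass all slider ranges at once. A minimal sketch of that idea, assuming the columns are held as NumPy arrays (rows_in_ranges, data and ranges are illustrative names, not part of the code above):

```python
import numpy as np


def rows_in_ranges(data, ranges):
    """Boolean mask of the rows that lie inside every (low, high) range.

    data   -- mapping from column name to equal-length array of values
    ranges -- mapping from column name to (low, high); needs at least one entry
    """
    names = list(ranges)
    mask = np.ones(len(data[names[0]]), dtype=bool)
    for name in names:
        low, high = ranges[name]
        column = np.asarray(data[name])
        # A row survives only if it passes this range and all earlier ones
        mask &= (column >= low) & (column <= high)
    return mask


data = {"A": np.array([10, 40, 70, 90]), "B": np.array([5, 50, 60, 95])}
ranges = {"A": (30, 80), "B": (0, 55)}  # stand-ins for the slider values
mask = rows_in_ranges(data, ranges)
filtered_a = data["A"][mask]  # values that feed the A histogram
filtered_b = data["B"][mask]  # values that feed the B histogram
```

In a callback, ranges could be built from the widgets, for example {"A": A_Slider.value, "B": B_Slider.value}, and the same mask applied before computing each histogram.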

@jnava I tried this by putting all the callbacks into a single function, and it doesn't seem to work.
With all the callbacks in one function, it throws multiple errors like this:

2019-07-26 17:53:21,462 Cannot apply patch to 1822 which is not in the document anymore

That message means you are trying to update some object that you have removed from the document (e.g. maybe a reference to a data source in a plot that you replaced, just a speculation).

So you mean to say that every time I move the slider, the updated data frame loses all the filtered values based on the slider input?
I have done exactly as stated in the answer, without any deviations.

No, I mean that you are replacing/deleting objects that you shouldn’t/don’t need to. The best-practice guidance for Bokeh apps is always to make the smallest change possible, and update properties of existing objects (rather than replacing objects wholesale).

Let’s try a different approach. You want a histogram that updates in response to some event. Here is a complete example of an updating histogram that you can emulate:

https://github.com/bokeh/bokeh/blob/master/examples/app/selection_histogram.py

This particular example updates the histogram (actually, updates four histograms) in response to a selection, but exactly the same principles and structures apply to updating from other events, e.g. a slider value change. Namely, you probably should update the value of some source.data, and leave everything else alone.
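
To make "update source.data and leave everything else alone" concrete, here is a minimal sketch of a helper that builds the dict of columns for a histogram. It reuses the column names from the code earlier in this thread; histogram_columns itself is a hypothetical name. Because the bins and range are fixed, the result has the same shape even when the filter leaves no rows.

```python
import numpy as np


def histogram_columns(values, n_bins, bin_range):
    """Return the columns that the quad glyph and hover tool read."""
    count, edges = np.histogram(values, bins=n_bins, range=bin_range)
    left, right = edges[:-1], edges[1:]
    return {
        "count": count,
        "left": left,
        "right": right,
        "f_count": [str(c) for c in count],
        "f_interval": [f"{lo:.0f} to {hi:.0f}" for lo, hi in zip(left, right)],
    }


full = histogram_columns(np.array([3, 12, 18, 47]), n_bins=5, bin_range=(0, 50))
empty = histogram_columns(np.array([]), n_bins=5, bin_range=(0, 50))
```

Inside a slider callback, the result would be assigned to the data property of the existing source, along the lines of source.data = histogram_columns(filtered_values, n_bins, bin_range), while the figure, glyph, and source objects stay untouched.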

@jnava I tried to implement your code as shown below,
but it doesn't seem to work.
What changes would you suggest to this code, and where do I update the column data source that you mentioned?

import pandas as pd
import numpy as np
from bokeh.models import HoverTool
from bokeh.io import curdoc
from bokeh.plotting import figure, ColumnDataSource
from bokeh.layouts import row, column, gridplot
from bokeh.models.widgets import RangeSlider

class hist_data:
    def __init__(self, df, col, n_bins, bin_range):
        self.A_lwr = min(df['A'])
        self.A_upr = max(df['A'])
        self.B_lwr = min(df['B'])
        self.B_upr = max(df['B'])
        self.C_lwr = min(df['C'])
        self.C_upr = max(df['C'])
        self.D_lwr = min(df['D'])
        self.D_upr = max(df['D'])
        self.col = col
        self.n_bins = n_bins
        self.bin_range = bin_range
        self.original_df = df
        self.source = ColumnDataSource(self.create_hist_data(df))
    
    def filt_df(self):
        filt = (pd.DataFrame(self.original_df[(self.original_df.A >=self.A_lwr) & (self.original_df.A <= self.A_upr) & (self.original_df.B >= self.B_lwr) & (self.original_df.B <= self.B_upr)
        & (self.original_df.C >= self.C_lwr) & (self.original_df.C <= self.C_upr)& (self.original_df.D >= self.D_lwr) & (self.original_df.D <= self.D_upr)]))        
        print(f'{self.A_lwr} {self.A_upr} {self.B_lwr} {self.B_upr}')
        filt.shape
        return ColumnDataSource(self.create_hist_data(filt))

    def create_hist_data(self,df):
        arr_hist, edges = np.histogram(df[self.col],bins=self.n_bins, range=self.bin_range)
        arr_df = pd.DataFrame({'count': arr_hist, 'left': edges[:-1], 'right': edges[1:]})
        arr_df['f_count'] = ['%d' % count for count in arr_df['count']]
        arr_df['f_interval'] = ['%d to %d ' % (left, right) for left, right in zip(arr_df['left'], arr_df['right'])]        
        return (arr_df)
        
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD')) 
hist_data_A = hist_data(df,'A',df['A'].nunique(),[min(df['A']),max(df['A'])])
hist_data_B = hist_data(df,'B',df['B'].nunique(),[min(df['B']),max(df['B'])])
hist_data_C = hist_data(df,'C',df['C'].nunique(),[min(df['C']),max(df['C'])])
hist_data_D = hist_data(df,'D',df['D'].nunique(),[min(df['D']),max(df['D'])])
A_Slider = RangeSlider(start=min(df['A']), end=max(df['A']), value=(min(df['A']),max(df['A'])), step=1, title='YY in mm')
B_Slider = RangeSlider(start=min(df['B']), end=max(df['B']), value=(min(df['B']),max(df['B'])), step=1, title='ZZ')
C_Slider = RangeSlider(start=min(df['C']), end=max(df['C']), value=(min(df['C']),max(df['C'])), step=1, title='Zdd')
D_Slider = RangeSlider(start=min(df['D']), end=max(df['D']), value=(min(df['D']),max(df['D'])), step=1, title='Zcc')


def callback(attr,new,old):
    hist_data_A.A_lwr = new[0]
    hist_data_A.A_upr = new[1]
    hist_data_A.source = hist_data_A.filt_df()
    Graphs1.children[0] = plot_data_A()
    hist_data_B.B_lwr = new[0]
    hist_data_B.B_upr = new[1]
    hist_data_B.source = hist_data_B.filt_df()
    Graphs1.children[1] = plot_data_B()  
    hist_data_C.C_lwr = new[0]
    hist_data_C.C_upr = new[1]
    hist_data_C.source = hist_data_C.filt_df()
    Graphs2.children[0] = plot_data_C()
    hist_data_D.D_lwr = new[0]
    hist_data_D.D_upr = new[1]
    hist_data_D.source = hist_data_D.filt_df()
    Graphs2.children[1] = plot_data_D()
A_Slider.on_change("value",callback)
B_Slider.on_change("value",callback)
C_Slider.on_change("value",callback)
D_Slider.on_change("value",callback)  

(df,'A',df['A'].nunique(),[min(df['A']),max(df['A'])])
(df,'B',df['B'].nunique(),[min(df['B']),max(df['B'])])

(df,'C',df['C'].nunique(),[min(df['C']),max(df['C'])])
(df,'D',df['D'].nunique(),[min(df['D']),max(df['D'])])
# Histogram
def interactive_histogram( hist_data, title,x_axis_label,x_tooltip):
     source = hist_data
     # Set up the figure same as before
     toollist = ['lasso_select', 'tap', 'reset', 'save','crosshair','wheel_zoom','pan','hover','box_select']
     p = figure(plot_width = 500, 
                plot_height = 500,
                title = title,
                x_axis_label = x_axis_label, 
                y_axis_label = 'Count',tools=toollist)
     
     # Add a quad glyph with source this time
     p.quad(bottom=0, 
            top='count', 
            left='left', 
            right='right', 
            source=source,
            fill_color='red',
            hover_fill_alpha=0.7,
            hover_fill_color='blue',
            line_color='black')
     
     # Add style to the plot
     p.title.align = 'center'
     p.title.text_font_size = '18pt'
     p.xaxis.axis_label_text_font_size = '12pt'
     p.xaxis.major_label_text_font_size = '12pt'
     p.yaxis.axis_label_text_font_size = '12pt'
     p.yaxis.major_label_text_font_size = '12pt'
     
     # Add a hover tool referring to the formatted columns
     hover = HoverTool(tooltips = [(x_tooltip, '@f_interval'),
                                   ('Count', '@f_count')])
     
     # Add the hover tool to the graph
     p.add_tools(hover)
     return p
 
binsize = 10
def plot_data_A():
    A_hist = interactive_histogram(hist_data_A.source, 'A Histogram','A','A')
    return A_hist

def plot_data_B():
    B_hist = interactive_histogram(hist_data_B.source, 'B Histogram','B ','B')
    return B_hist

def plot_data_C():
    C_hist = interactive_histogram(hist_data_C.source, 'C Histogram','C','C')
    return C_hist

def plot_data_D():
    D_hist = interactive_histogram(hist_data_D.source, 'D Histogram','D ','D')
    return D_hist

Graphs1 = row([plot_data_A(), plot_data_B()])
Graphs2 = row([plot_data_C(), plot_data_D()])
Controls1= column([A_Slider,B_Slider])
Controls2= column([C_Slider,D_Slider])
grid = gridplot([[Graphs1],
                 [Graphs2],
                 [Controls1,Controls2]])
    
curdoc().add_root(grid)
curdoc().title = "qlsample"

@Bryan
Thanks a lot for this code.
However, replicating it for my case, with multiple graphs and multiple sliders, seems much more complicated than the code already presented above.

Your callback is doing this:

hist_data_A.source = hist_data_A.filt_df()

You should emulate the example I linked above, and update the .data properties of the existing sources, e.g.

hist_data_A.source.data  = new_filtered_data_dict

I would also suggest trying to start smaller and build up from there. Getting a small subset working (one plot, one slider) and then adding things will let you ask for help in a more focused manner (“I made this small 5 line change and things stopped working”) than just posting 150 lines of code and “this does not work”.
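
A possible starting point for the "one plot, one slider" case is sketched below. It is an assumption-laden sketch, not the linked example: hist_dict, make_callback and build_app are hypothetical names, and the bin count and range are fixed at made-up values. The callback assigns to source.data and never replaces the source or the plot.

```python
import numpy as np

N_BINS, BIN_RANGE = 20, (0, 100)  # fixed, so every update has the same shape


def hist_dict(values):
    """Quad columns for the histogram of `values` over the fixed bins."""
    count, edges = np.histogram(values, bins=N_BINS, range=BIN_RANGE)
    return {"count": count, "left": edges[:-1], "right": edges[1:]}


def make_callback(source, values):
    """Return an on_change handler that rewrites source.data in place."""
    def on_slider_change(attr, old, new):  # Bokeh passes (attr, old, new)
        low, high = new
        kept = values[(values >= low) & (values <= high)]
        source.data = hist_dict(kept)  # same source object, new data
    return on_slider_change


def build_app():
    """Wire one RangeSlider to one histogram in the current document."""
    # Imported here so the helpers above can be exercised without Bokeh.
    from bokeh.io import curdoc
    from bokeh.layouts import column
    from bokeh.models import ColumnDataSource, RangeSlider
    from bokeh.plotting import figure

    values = np.random.randint(0, 100, size=100)
    source = ColumnDataSource(hist_dict(values))
    plot = figure(title="A Histogram", x_axis_label="A", y_axis_label="Count")
    plot.quad(bottom=0, top="count", left="left", right="right",
              source=source, fill_color="red", line_color="black")
    slider = RangeSlider(start=0, end=100, value=(0, 100), step=1, title="A")
    slider.on_change("value", make_callback(source, values))
    curdoc().add_root(column(plot, slider))
```

To try it, add a call to build_app() at the end of the script and start it with bokeh serve --show followed by the script name. Once this works, further sliders and plots can be added one at a time, each plot keeping its own source.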

@Bryan one plot with a slider is working fine now.
3 to 4 plots is what I need help with.
In the current code shared by @jnava, the main problem is that only the first graph gets updated and the other graphs don't have any linked interactions.
This is the exact part I need help with.
First of all, it would help me to fix the code I have posted, so could someone please help me understand why it does not work?

@bryan @jnava
I tried to implement the program on a smaller data set as you advised, and all the graphs update with the sliders. However, on another move of a slider, some of the graphs disappear and go empty with a PATCH-DOC error
and the following error:

2019-08-06 10:52:46,820 Cannot apply patch to 1926 which is not in the document anymore

Could you please suggest an implementation with a small code snippet?
As for the code that @Jnava gave me,
it works without any problems, except that the graphs update on a slider move and then throw a PATCH-DOC error,
and a slider only updates the graphs in the given column:

Graphs1.children[0] = plot_data_A()

Could you please guide me on how to implement both source.data on the new filtered df and also this way of updating the plots?

@Bryan with

new_filtered_data_dict

do you mean

hist_data_A.filt_df()

Could you point to the parts of the program you are talking about?
I tried this:

hist_data_A.source = hist_data_A.filt_df()

Is this what you meant?
If yes, this function returns a new ColumnDataSource every time the sliders are moved, if I understand the program right,
so this won't work.
Could you help me with a code snippet that implements what you mean?

From a quick glance at the code, I suspect that for various combinations of your sliders there might be no intersecting data to create a histogram (but I won’t know for sure without spending more time). You probably need to change your filter function such that the shape of the dataframe remains constant. Maybe change the dataframe values to np.nan if they fall outside the selected ranges (so that they are not counted in the histogram).

The other thing that goes in hand with this is that the histogram creation function can be simplified to only provide the data needed, and update the ColumnDataSource.data property.

def filt_df(self):
    df_temp = self.original_df.astype(float)  # float copy, so NaN can be stored
    # Blank out values that fall outside the slider ranges
    df_temp.loc[df_temp['A'] < self.A_lwr, 'A'] = np.nan
    df_temp.loc[df_temp['A'] > self.A_upr, 'A'] = np.nan
    df_temp.loc[df_temp['B'] < self.B_lwr, 'B'] = np.nan
    # ... plus all other filters...
    self.source.data = self.create_hist_data(df_temp)

def create_hist_data(self,df):
    count, edges = np.histogram(df[self.col],bins=self.n_bins, range=self.bin_range)
    bin_names = [f'{int(edges[n])} to {int(edges[n+1])}' for n in range(len(edges[:-1]))]
    arr_df = {'count': count, 'bin_names': bin_names, 'left':edges[:-1], 'right':edges[1:]}
    return (arr_df)

One other thing is that the range and number of bins should probably remain constant; I see no good reason to change them dynamically. In this dummy example you have a range of 0 to 100. It would probably make more sense to instantiate the hist_data class with a static range and number of bins.

hist_data_dummy = hist_data(df,'A', 20, [0,100])
#or some other values that make sense

With these changes you should also make the corresponding change to the callback since the source data is being updated from the filter method.

Change:

    hist_data_xx.source = hist_data_xx.filt_df()    # Remove lines like this

To:

    hist_data_xx.filt_df()    # Replace with lines like this
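
The NaN idea can be checked in isolation. The sketch below assumes plain NumPy arrays rather than the DataFrame used above, and mask_to_nan is a hypothetical helper. When an explicit range is passed, np.histogram only counts values inside that range, so NaN entries are skipped while the masked array keeps its original length.

```python
import numpy as np


def mask_to_nan(values, keep):
    """Copy `values` as floats, with entries where `keep` is False set to NaN."""
    return np.where(keep, np.asarray(values, dtype=float), np.nan)


a = np.array([10, 40, 70, 90])
b = np.array([5, 50, 60, 95])

# A row is kept only if it passes both slider ranges (made-up values here)
keep = (a >= 30) & (a <= 80) & (b >= 0) & (b <= 55)

masked_b = mask_to_nan(b, keep)  # same length as b, NaN where filtered out
count, edges = np.histogram(masked_b, bins=4, range=(0, 100))
```

Whether to mask with NaN or simply drop the rows is a design choice; either way the histogram arrays keep a constant length as long as the bins and range stay fixed.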

No, I mean you should not try to replace (assign to) a .source object. You should modify the .data property of an existing data source. This is regardless of what or how you filter. I can’t tell you how to filter your data, only you know what you need.

There are lots of examples of setting source.data (and not replacing a source object altogether) in the examples directory of the GitHub repo:

https://github.com/bokeh/bokeh/tree/master/examples/app