Bounding categorical x axis

grzegorz.malinowski · June 4, 2020, 10:54am

As in the title. It almost works, but I suppose it needs a tiny adjustment.

p=5
category = ['PC ' + str(i) for i in range(1, p+1)]
pad = 5/100 
x_cat_range = FactorRange(factors=category, range_padding = pad, bounds = (0-pad, 5+pad))
plot = figure(x_range=x_cat_range)

p-himik · June 4, 2020, 11:16am

And what is the question?

grzegorz.malinowski · June 4, 2020, 11:22am

How do I define bounds so that the plot stays static? If I zoom out, I'd like to have the plot “blocked” within the ranges/bounds (imagine the situation where you don’t have a pan tool, so the plot is static).
I can do it for continuous data, but what about categorical data?
I forgot the line below; it defines the bar width.

bar = plot_2.vbar(x='category', top='y_data', source=source, width=0.8)

p-himik · June 4, 2020, 11:24am

Please provide a minimum reproducible example, including some test data and all of the required imports.

grzegorz.malinowski · June 4, 2020, 11:30am

sure, please play with turn on/off x_range

p = 5
data = np.array([3,2,5,6,2])
tools = 'pan, box_zoom, reset, save'
category = ['PC ' + str(i) for i in range(1, p+1)]
s_data = {'category': category,
          'y_data': data,
          'proportion': data/np.sum(data),
          'cum_proportion': np.cumsum(data/np.sum(data))}

source = ColumnDataSource(data=s_data)
pad = 5/100
y_range = DataRange1d(start=0, end=data.max()+data.max()*pad, bounds = (0, data.max()+data.max()*pad))
x_cat_range = FactorRange(factors=category, range_padding = pad, bounds = (-0.04, 6+pad))
plot_2 = figure(x_range=x_cat_range, y_range=y_range,
                frame_height=300, frame_width=300,
                tools=tools, match_aspect=False)

plot_2.title.text = 'Total Variance: {:.3}'.format(0.342133)
plot_2.xaxis.axis_label = 'Component'
plot_2.yaxis.axis_label = 'Eigenvalue'
lambda_bar_formatter = {'line_color': 'black',
                        'fill_color': 'CornflowerBlue',
                        'hover_color': 'red',
                        'hover_line_color': 'black',}
bar = plot_2.vbar(x='category', top='y_data', source=source, width=0.8, **lambda_bar_formatter)
plot_2.toolbar.logo = None
plot_2.xgrid.visible = False
plot_2.ygrid.visible = False
hover = HoverTool(
    tooltips=[('name', '@category'),
              ('value', '@y_data')],
    mode='mouse')

hover.renderers=[bar]
plot_2.add_tools(hover)
show(plot_2)

p-himik · June 4, 2020, 12:39pm

I’m afraid I still don’t understand what you want:

1) A minor issue - you didn’t provide the imports. Although I can fix it on my end, it’s not great.
2) You write “If I zoom out”, but your example doesn’t have any zooming out functionality except for the “Reset” tool.
3) “play with turn on/off x_range” - I have no idea what that means. You can’t “turn off” a range. Perhaps you can somehow modify the example to make your point obvious.

grzegorz.malinowski · June 4, 2020, 12:47pm

OK, let’s make it simple. I’d like to reproduce the 1st plot (bars are centered) by always keeping a fixed distance between the left/right bars and the frame. The 2nd picture is what I’d like to avoid. The pan tool is active in both cases.

So, how to properly define FactorRange?

x_cat_range = FactorRange(factors=category, range_padding = pad, bounds = (0-pad, p+pad))

[Image 1: bokeh_plot(5)]
[Image 2: bokeh_plot(6)]

p-himik · June 4, 2020, 1:10pm

OK, so the issue is not about zooming, but about bounds in general.

Take a look at the note in the bounds documentation: ranges — Bokeh 2.4.2 Documentation
As you can see, it mentions synthetic coordinates.
You give the range a range_padding, which is interpreted according to range_padding_units, and by default that is "percent". The bounds field, however, expects plain synthetic units, so you cannot just plug a percentage of those units into it.

You may want to set range_padding_units='absolute'. Alternatively, convert pad from percentage to the units when you pass it to range_padding.
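
A minimal sketch of the first suggestion, under the assumption that the factors are a flat list and factor_padding is left at 0, so that n factors span 0 to n in synthetic coordinates. The helper names categorical_bounds and make_x_range are hypothetical, not part of Bokeh. The Bokeh import sits inside make_x_range only so that the arithmetic can be checked without Bokeh installed.

```python
def categorical_bounds(n_factors, pad):
    """Return (low, high) in synthetic coordinates for n flat factors.

    Assumes factor_padding == 0, so the factors span 0..n_factors, and
    that `pad` is an absolute distance (range_padding_units='absolute').
    """
    return (0 - pad, n_factors + pad)


def make_x_range(factors, pad=0.05):
    # Imported here so categorical_bounds() stays usable without Bokeh.
    from bokeh.models import FactorRange

    return FactorRange(
        factors=factors,
        range_padding=pad,
        range_padding_units="absolute",  # same units as `bounds`
        bounds=categorical_bounds(len(factors), pad),
    )


category = ["PC " + str(i) for i in range(1, 6)]
low, high = categorical_bounds(len(category), 0.05)
```

With both values expressed in the same units, the padded range and its bounds should coincide, so panning has no room to move the bars. The result of make_x_range(category) would be passed as x_range to figure().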

grzegorz.malinowski · June 4, 2020, 1:14pm

Ok. I’ll be experimenting with that. Thank you.

grzegorz.malinowski · June 15, 2020, 7:02pm

OK, I still have some questions. Please check my code below. I’ve noticed 2 problems. I’d appreciate your support.

a) when btn_run.on_click(execute) is executed first time, the script doesn’t control the y_range.start / end and bounds
b) when btn_run.on_click(execute) is executed I have no idea how to reset tools settings (to properly display graph - in my case it applies to wheel_zoom since pan is fixed by range/bounds, but I suppose it affects all tools).

TOOLS = 'pan, wheel_zoom, undo, redo, reset, save'    

plot_1 = figure(frame_width=300, frame_height=300,
                x_range=FactorRange(),
                y_range = DataRange1d(),
                tools=TOOLS, match_aspect=False, name='plot_1')

def execute():
    ............
    #--> Bokeh update
    CDS_lambda.data.update({'category': category,
                            'y_data': np.diag(L)})

    plot_1.title.text = '% Variance Explained: {:.3}'.format(np.diag(L).sum())
    plot_1.x_range.factors = category
    plot_1.x_range.bounds=(0, p)
    plot_1.y_range.range_padding_units='absolute'
    #plot_1.y_range.range_padding = 0
    plot_1.y_range.start=0
    plot_1.y_range.end=np.diag(L).max() + 0.05*np.diag(L).max()
    plot_1.y_range.bounds = (0, np.diag(L).max() + 0.05*np.diag(L).max())

btn_run.on_click(execute)

p-himik · June 15, 2020, 7:13pm

a) When btn_run.on_click(execute) line is executed, all it does is register the execute function as a callback that will be called when the button is clicked. Nothing more, so the execute function will not be run right away. If you need it to be run right away, just call it manually
b) I have no idea what you mean, especially by “how to reset tools settings”. The wheel zoom tool doesn’t have any state - it just changes the ranges. If you change the ranges yourself, it will just work
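
To illustrate point a) without a running Bokeh server, here is a sketch that uses a hypothetical stand-in class, FakeButton, in place of the real Button. It is not a Bokeh class and only mimics the registration behaviour described above: on_click stores the function, and nothing runs until a click happens or the function is called directly.

```python
class FakeButton:
    """Hypothetical stand-in for a button widget (not a Bokeh class)."""

    def __init__(self):
        self._handlers = []

    def on_click(self, handler):
        # Registration only: the handler is stored, not called.
        self._handlers.append(handler)

    def click(self):
        # Simulates the user pressing the button.
        for handler in self._handlers:
            handler()


calls = []


def execute():
    calls.append("executed")


btn_run = FakeButton()
btn_run.on_click(execute)
calls_after_registration = len(calls)  # nothing has run yet

execute()        # call it manually to populate the initial view
btn_run.click()  # later clicks run the same function again
```

In the real app the same pattern would be btn_run.on_click(execute) followed by a plain execute() call wherever the initial state should be set up.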

grzegorz.malinowski · June 15, 2020, 7:37pm

I can’t call the execute function manually because I want to have a choice what I want to do. Let’s go through it step by step. As for me, it should be configurable somehow.

1) ready state - ok
2) btn_run.on_click(execute) is executed first time - not ok (I’d like to control top and bottom padding)
3) playing with zoom - unless you press the ‘reset’ tool there is no way to go back to the initial view (when you execute btn_run.on_click(execute) again to load new data, the x-axis scale is taken from the most recent display unless, again, you press the ‘reset’ tool) → but this raises the problem described in point 2

p-himik · June 15, 2020, 7:58pm

Just to reiterate - it has been executed before you even see the plot. What you probably mean is that the execute function has been executed as a result of you pressing the button. These are two very different things, you should not confound them.

Well, then just adjust range_padding. What’s the problem with that? If you want different padding at different sides then you’ll have to adjust start and end manually. A simpler Range1d model may be more suitable for the manual control.

If you pan a plot that uses DataRange1d then it won’t update the range once there’s new data - your interaction overrides the automatically computed range. You already adapt the Y range in the execute function - just update the X range as well, that’s it.

To summarize - you get the most flexibility by just manually adjusting start and end attributes of the Range1d model instances. Just compute the padding manually and set the attributes accordingly.
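
The manual computation suggested above might look like the following sketch. The helper name padded_limits and its parameters are assumptions made for illustration; the padding is expressed as a fraction of the distance between the baseline and the data maximum, which matches the 5% top padding used earlier in the thread when the baseline is 0.

```python
def padded_limits(values, pad_low=0.0, pad_high=0.05, baseline=None):
    """Return (start, end) around `values` with independent padding.

    `baseline` replaces the data minimum as the lower reference point
    (for example 0 for bar charts). pad_low and pad_high are fractions
    of the span between that reference point and the data maximum.
    """
    lo = min(values) if baseline is None else baseline
    hi = max(values)
    span = hi - lo
    return lo - pad_low * span, hi + pad_high * span


# Same test data as in the earlier example: bars start at 0, 5% headroom.
start, end = padded_limits([3, 2, 5, 6, 2], pad_high=0.05, baseline=0)
```

For a Range1d instance named y_range, the pair would then be assigned to y_range.start and y_range.end, and, if panning should be blocked, also used as y_range.bounds = (start, end).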

grzegorz.malinowski · June 15, 2020, 8:08pm

Please take into account that I have more than 1 example.
Are you able to play with my code and invent something that will make my life easier?

p-himik · June 15, 2020, 8:22pm

No idea what you mean by “example”, but whatever it is, you can just extract any repeated part in a function and just use that function.

I probably am, but I won’t, sorry. So far, you haven’t provided a single minimal working example that I can use without any modifications. And half of your questions contain so little detail that they leave me wondering what you actually want.

grzegorz.malinowski · June 17, 2020, 1:39pm

I was trying to find the solution but without success. Please check the code and play with the buttons and tools. I’m still facing the same problems with plot_1 as described in points 2 and 3. According to the console, the data has the correct values.

index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    {{ bokeh_css }}
    {{ bokeh_js }}
    <link rel="stylesheet" href="bokeh/static/css/style.css"/>
  </head>
  <body>


{% extends base %}
{% block contents %}

<div class='grid'>
    <div class='panel'>
        <p class='axs'>DATA</p><hr><br />
        {{ embed(roots.menu_dataset) }}

    </div>
    <div class='plot_1'>{{ embed(roots.plot_1) }}</div>
    <div class='plot_2'>{{ embed(roots.plot_2) }}</div>
</div>

{% endblock %}
</body>
</html>

main.py

import numpy as np
np.set_printoptions(precision=4, suppress=True)
import pandas as pd


from bokeh.io import curdoc
from bokeh.models import FactorRange, DataRange1d, ColumnDataSource
from bokeh.models import Select, Button, RadioButtonGroup, Column, Div, HoverTool, Span
from bokeh.plotting import figure


#--> data loading
def load_data(database):
    if database == 'example_1':
        a = [1,2,3]
        b = [2,4,2]
        c = [3,2,5]
        d = [4,5,1]
        e = [5,1,2]
        data = np.vstack((a,b,c,d,e))
    elif database == 'example_2':
        a = [1,3,1]
        b = [4,4,3]
        c = [2,2,2]
        d = [3,2,4]
        e = [5,5,0]
        f = [60,2,3]
        data = np.vstack((a,b,c,d,e,f))
    return data


#--> data preprocessing
def preprocess_data(data, standardize):
    mu = np.mean(data, axis=0)

    data_cen = data - mu
    data_std = data_cen / np.std(data, axis=0, ddof=1)

    if standardize == 'center':
        return data_cen
    elif standardize == 'scale':
        return data_std


#--> decomposition
def SVD_decompose(data):
    L = data.T@data
    return L



def initialize():
    # data setting
    data = load_data(select_data.value)
    standardize = preprocess_[btng_norm.active]
    X = preprocess_data(data, standardize)
    n = X.shape[0]
    p = X.shape[1]
    component = ['#' + str(i) for i in range(1, p+1)]
    L = SVD_decompose(X)
    # CDS
    CDS_lambda.data.update({'component': component,
                            'lambda': np.diag(L),
                            'proportion': np.diag(L)/L.trace(),
                            'cumsum_proportion': np.cumsum(np.diag(L)/L.trace())})
    # ranges
    x_.update(factors = component,
              bounds = (0, len(component)))
    y_1.update(start = 0,
               end = 1.05*np.diag(L).max(),
               bounds = (0, 1.05*np.diag(L).max()))


examples_ = ['example_1', 'example_2']              #--> example_1
preprocess_ = ['center', 'scale']                   #--> scale


select_data = Select(title = 'Select Data', value = examples_[0], options = examples_)
btng_norm = RadioButtonGroup(labels = preprocess_, active = 1, margin=(0, 5, 5, 5))
btn_run = Button(label='execute', css_classes=['pad', 'btn_style'], margin=(0, 5, 5, 5))




data_lambda = {'component': [],
               'lambda': [],
               'proportion': [],
               'cumsum_proportion': []}
CDS_lambda = ColumnDataSource(data=data_lambda)

x_ = FactorRange()
y_1 = DataRange1d()
y_2 = DataRange1d(start=0, end=1.05, bounds=(0, 1.05))

initialize()
print(CDS_lambda.data)


# plot_1
plot_1 = figure(frame_width=300, frame_height=300, match_aspect=False,
                x_range=x_, y_range=y_1,
                name='plot_1')

plot_1.title.text = 'Total Variance: {:0.3f}'.format(CDS_lambda.data['lambda'].sum())
plot_1.xaxis.axis_label = 'Components'
plot_1.yaxis.axis_label = 'Eigenvalue contribution'
plot_1.xgrid.visible = False
plot_1.ygrid.visible = False


bar = plot_1.vbar(x='component', top='lambda', source=CDS_lambda, width=1)
avg = Span(location=CDS_lambda.data['lambda'].mean())  # add_layout() returns None, so keep the Span itself
plot_1.add_layout(avg)

hover_lambda = HoverTool(
    tooltips=[('PC', '@component'),
              (chr(955), '@lambda{0.000}')],
    mode='mouse')
hover_lambda.renderers=[bar]
plot_1.add_tools(hover_lambda)


# plot_2
plot_2 = figure(frame_width=300, frame_height=300, match_aspect=False,
                x_range=x_, y_range=y_2,
                name='plot_2')

plot_2.title.text = '% Variance Explained'
plot_2.xaxis.axis_label = 'Components'
plot_2.yaxis.axis_label = 'Eigenvalue contribution'
plot_2.xgrid.visible = False
plot_2.ygrid.visible = True
#plot_2.legend.location = (180,30)
#plot_2.legend.click_policy='hide'


plot_2.line(x='component', y='proportion', source=CDS_lambda, line_color='blue', legend_label='proportion')
pro = plot_2.circle(x='component', y='proportion', source=CDS_lambda, line_color='blue', legend_label='proportion', size=7, fill_color='white', hover_fill_color='red')

plot_2.line(x='component', y='cumsum_proportion', source=CDS_lambda, line_color='black', legend_label='cumulative')
cum = plot_2.circle(x='component', y='cumsum_proportion', source=CDS_lambda, line_color='black', legend_label='cumulative', size=7, fill_color='white', hover_fill_color='red')


hover_variance = HoverTool(
    tooltips=[('PC', '@component'),
              ('proportion', '@proportion{0.000}'),
              ('cumulative', '@cumsum_proportion{0.000}')],
    mode='vline')
hover_variance.renderers=[pro, cum]
plot_2.add_tools(hover_variance)


def execute():
    data = load_data(select_data.value)
    standardize = preprocess_[btng_norm.active]
    X = preprocess_data(data, standardize)
    n = X.shape[0]
    p = X.shape[1]
    component = ['#' + str(i) for i in range(1, p+1)]
    L = SVD_decompose(X)

    # CDS
    CDS_lambda.data.update({'component': component,
                            'lambda': np.diag(L),
                            'proportion': np.diag(L)/L.trace(),
                            'cumsum_proportion': np.cumsum(np.diag(L)/L.trace())})

    x_.update(factors = component,
              bounds = (0, len(component)))
    y_1.update(start = 0,
               end = 1.05*np.diag(L).max(),
               bounds = (0, 1.05*np.diag(L).max()))


    plot_1.title.text = '% Variance Explained: {:0.3f}'.format(np.diag(L).sum())
    print()
    print(CDS_lambda.data)
    print(x_.bounds)
    print(y_1.bounds)
    print(y_1.start)
    print(y_1.end)
    print()

btn_run.on_click(execute)

print(x_.bounds)
print(y_1.bounds)
print(y_1.start)
print(y_1.end)
print()


#--> sidebar
menu_dataset = Column(select_data, Div(text="""Preprocess Data""", margin=(5, 5, 0, 5), css_classes=['missing_labels']),
                      btng_norm, Div(text="""Analyse Data""", margin=(5, 5, 0, 5), css_classes=['missing_labels']),
                      btn_run, name='menu_dataset', width=250)


curdoc().add_root(menu_dataset)
curdoc().add_root(plot_1)
curdoc().add_root(plot_2)
curdoc().title = "My dashboard"

grzegorz.malinowski · June 18, 2020, 2:04pm

Could anyone help?

Bryan · June 18, 2020, 4:31pm

If you are setting numeric ranges manually, you should use Range1d not DataRange1d. The data ranges are one of the most complicated things in Bokeh because they have to mediate computed data envelopes, user stipulated bounds, changes from interactive tools, auto-following, aspect preservation and more. There are just many possible combinations of behaviours and not all of them make sense and not all of them have gotten much if any testing or coverage. If you don’t need any of the features of a data range, it will always be best to jettison all that complexity and use the far simpler Range1d.

If I replace with Range1d in your code then things seem to update reasonably (The code and app are complicated so I’m not 100% sure what it is supposed to do).
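
A sketch of how that replacement could be organised, assuming the range updates duplicated in initialize and execute are pulled into one hypothetical helper, apply_ranges. The helper only assigns attributes, so it should work on any objects exposing factors/bounds and start/end/bounds; in the app these would be created as x_ = FactorRange() and y_1 = Range1d(). SimpleNamespace objects stand in for them here so the snippet runs without Bokeh.

```python
from types import SimpleNamespace


def apply_ranges(x_range, y_range, component, lam, pad=0.05):
    """Set categorical x limits and padded numeric y limits in one place."""
    top = (1 + pad) * max(lam)
    x_range.factors = list(component)
    x_range.bounds = (0, len(component))
    y_range.start = 0
    y_range.end = top
    y_range.bounds = (0, top)


# Stand-ins for FactorRange() and Range1d(); these are not Bokeh objects.
x_ = SimpleNamespace(factors=[], bounds=None)
y_1 = SimpleNamespace(start=0, end=1, bounds=None)

apply_ranges(x_, y_1, ["#1", "#2", "#3"], [4.0, 2.5, 1.5])
```

Both initialize and execute could then call apply_ranges(x_, y_1, component, np.diag(L)) after updating the data source, which keeps the two code paths from drifting apart.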

grzegorz.malinowski · June 18, 2020, 7:40pm

Thanks, Bryan. I’ll check it. I hope we’ll all soon be enjoying more detailed documentation aimed at non-developers.

grzegorz.malinowski · June 18, 2020, 7:43pm

Works perfectly. Thank you once again.