Selection breaking when loading data to server with CustomJS

When selecting points, the selection glyph that gets drawn uses the initial styling of the text rather than the selection styling I specified, i.e.

renderer = p.text(text='index', text_alpha=0.5, **kw)
renderer.nonselection_glyph.text_alpha = 0
# create the glyphs for selection; note direct call of Text(), not p.text()
text_glyph = Text(text_alpha=1, text_color='red')
renderer.selection_glyph = text_glyph

results in translucent black text on selection, not red text. This happens when I use some CustomJS to upload data to the server. If I skip that step and load the data directly, selection gives red text, and I have absolutely no idea why. All of the above code runs after the data has been loaded.

To test it, run bokeh serve --show test3.py. To get the broken version, upload two CSV files using the buttons once the server has started (the script writes two test files to os.getcwd()). The upload sometimes doesn't work and you need to upload again. To get a working version, uncomment the code at the bottom and run the server.

Thanks!

test3.py (5.95 KB)

For convenience here’s the full code:

import bokeh.plotting as blt
from bokeh.models.glyphs import Text
import pandas as pd
from bokeh.plotting import curdoc, figure, show, ColumnDataSource
import os, sys
from bokeh.models import CustomJS, Select
from bokeh.models.widgets import Button
from io import StringIO
import base64
from bokeh.models.selections import Selection

#os.chdir('/Users/johnc.thomas/Dropbox/crispr/pycharm/bokeh_charts/tst')

test_csv = "index,lfc,p\na,5,1\nb,3,4"
for fn in ('somedat.csv', 'somedat2.csv'):
    if not os.path.isfile(fn):
        with open(fn, 'w') as f:
            f.write(test_csv)

DEBUG = False

def debugprint(x):
    if DEBUG:
        print(x)
    return

# file upload JS from https://github.com/bokeh/bokeh/issues/6096#issuecomment-299002827

JSloadfile = """
function read_file(filename) {
    var reader = new FileReader();
    reader.onload = load_handler;
    reader.onerror = error_handler;
    // readAsDataURL represents the file's data as a base64 encoded string
    reader.readAsDataURL(filename);
}

function load_handler(event) {
    var b64string = event.target.result;
    file_source.data = {'file_contents' : [b64string], 'file_name':[input.files[0].name]};
    file_source.trigger("change");
}

function error_handler(evt) {
    if(evt.target.error.name == "NotReadableError") {
        alert("Can't read file!");
    }
}

var input = document.createElement('input');
input.setAttribute('type', 'file');
input.onchange = function(){
    if (window.FileReader) {
        read_file(input.files[0]);
    } else {
        alert('FileReader is not supported in this browser');
    }
}
input.click();
"""

def open_table(file_io, fn):
    """return a pandas DF from a .csv, .tsv, .txt, or .xlsx
    If an Excel file, the first sheet will be used.
    txt and tsv means tab separated."""
    if fn.endswith('.xlsx'):
        df = pd.read_excel(file_io, index_col=0)
    else:
        if fn.endswith('.txt') or fn.endswith('.tsv'):
            sep = '\t'
        elif fn.endswith('.csv'):
            sep = ','
        else:
            print("Only .csv, .tsv, .txt or .xlsx recognised.")
            raise TypeError
        df = pd.read_table(file_io, sep=sep, index_col=0)
    return df

file_sources = [ColumnDataSource({'file_contents': [], 'file_name': []}),
                ColumnDataSource({'file_contents': [], 'file_name': []})]

colnames = dict(index=[], lfc=[], p=[], lfc_2=[], p_2=[])
main_source = ColumnDataSource(colnames)

def file_callback(attr,old,new):
    # check two files have been loaded
    debugprint('file_callback() called')
    fnames = [file_sources[i].data['file_name'] for i in (0,1)]
    if not all(fnames):
        return
    dfs = []
    for file_source in file_sources:
        raw_contents = file_source.data['file_contents'][0]
        prefix, b64_contents = raw_contents.split(",", 1)
        file_contents = base64.b64decode(b64_contents)
        file_io = StringIO(file_contents.decode())
        dfs.append(open_table(file_io, file_source.data['file_name'][0]))
    # for i in len(dfs)-1 if more than 2
    dfs[1].columns = [c+'_2' for c in dfs[1].columns]
    main_source.data = ColumnDataSource(pd.concat(dfs, axis=1, sort=True)).data
    # # below doesn't fix the problem and results in piling up data
    # global main_source
    # main_source = ColumnDataSource(pd.concat(dfs, axis=1, sort=True))
    populate_plots()

# calls the func file_callback upon change of the specified attr: 'data'
# I think it's actually the button.callback that modifies file_source.data
# then the above func takes the raw data added to file_source
for fs in file_sources:
    fs.on_change('data', file_callback)

lbutton = Button(label="Upload", button_type="success")
rbutton = Button(label="Upload2", button_type="success")
lbutton.callback = CustomJS(args=dict(file_source=file_sources[0]), code = JSloadfile)
rbutton.callback = CustomJS(args=dict(file_source=file_sources[1]), code = JSloadfile)

TOOLS = 'box_select'
lp = figure(plot_width=400, plot_height=400, tools=TOOLS)
rp = figure(plot_width=400, plot_height=400, tools=TOOLS)
plots = blt.gridplot([[lp, rp],
                      [lbutton, rbutton]])
curdoc().add_root(plots)

def populate_plots():
    debugprint(main_source.data)
    # figures created in the global above, this should be called every time
    # main_source.data is changed by file_callback at least
    lkw = dict(x='lfc', y='p', source=main_source)
    rkw = dict(x='lfc_2', y='p_2', source=main_source)
    for p, kw in (lp, lkw), (rp, rkw):
        p.circle(alpha=0.4, **kw)
        # plot the text labels, the renderer object is captured so that it can be meddled with
        # alpha is set to zero as we want the labels to be invisible to start off with
        renderer = p.text(text='index', text_alpha=0.5, **kw)
        renderer.nonselection_glyph.text_alpha = 0
        # create the glyphs for selection; note direct call of Text(), not p.text()
        text_glyph = Text(text_alpha=1, text_color='red')
        renderer.selection_glyph = text_glyph

DEBUG = True

# UNCOMMENTING BELOW RESULTS IN FUNCTIONAL CHART
# def fake_file_callback():
#     df1 = pd.read_csv('somedat.csv', index_col=0)
#     df2 = pd.read_csv('somedat2.csv', index_col=0)
#     df2.columns = [c+'_2' for c in df2.columns]
#     main_source.data = ColumnDataSource(pd.concat([df1, df2], axis=1, sort=True)).data
#
# fake_file_callback()
# populate_plots()

That's a lot of code, so maybe I should summarise. The problem occurs when the data is added to main_source using file_callback(), but not when using fake_file_callback().

In the former case: when an upload button is pressed, the user selects a file and the JS code puts the contents of that file into one of the ColumnDataSources in file_sources. A change to the .data attribute of either of those sources calls file_callback(). Once both file_sources have data, file_callback() decodes the base64-encoded contents into a ColumnDataSource (via a pandas DataFrame) and assigns that local CDS's .data to the global main_source.data.
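To make the decoding step concrete, here is a minimal standard-library sketch of what happens to the string that readAsDataURL produces. The helper name decode_data_url is hypothetical, the data URL is built by hand rather than coming from a browser, and the csv module stands in for pandas so the snippet runs without extra packages.

```python
import base64
import csv
from io import StringIO


def decode_data_url(data_url):
    """Return the decoded text payload of a base64 data URL (hypothetical helper)."""
    # readAsDataURL yields "data:<mime>;base64,<payload>"; keep only the payload
    prefix, b64_payload = data_url.split(",", 1)
    return base64.b64decode(b64_payload).decode("utf-8")


# Build a data URL by hand from the test CSV used in the script
test_csv = "index,lfc,p\na,5,1\nb,3,4"
data_url = "data:text/csv;base64," + base64.b64encode(test_csv.encode("utf-8")).decode("ascii")

text = decode_data_url(data_url)
rows = list(csv.DictReader(StringIO(text)))
print(rows[0]["index"], rows[0]["lfc"], rows[0]["p"])  # prints: a 5 1
```

The split on the first comma is the same trick file_callback() uses to drop the "data:...;base64" prefix before decoding.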

In the case of fake_file_callback(): csv -> pd.DataFrame -> local ColumnDataSource -> main_source.data
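The pandas part of that chain can be tried on its own. This is only a sketch, assuming pandas is installed: it reads the test CSV from a string instead of from somedat.csv and somedat2.csv, and it stops before the ColumnDataSource step.

```python
from io import StringIO

import pandas as pd

test_csv = "index,lfc,p\na,5,1\nb,3,4"

# Two frames read the same way as in the script, indexed by the first column
df1 = pd.read_csv(StringIO(test_csv), index_col=0)
df2 = pd.read_csv(StringIO(test_csv), index_col=0)

# Suffix the second frame's columns so they do not clash with the first
df2.columns = [c + "_2" for c in df2.columns]

# Align on the shared index and place the columns side by side
combined = pd.concat([df1, df2], axis=1, sort=True)
print(list(combined.columns))  # prints: ['lfc', 'p', 'lfc_2', 'p_2']
```

These are the column names that the x and y arguments in populate_plots() refer to.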

So I found a way to make it work, in case anyone cares. I realised the glyph rendering part shouldn't be called every time the ColumnDataSource is updated, so I moved it out of the function and into the global scope. I'm not sure why this helped, and I still don't know why it worked with fake_file_callback() and not file_callback(), but I'm not going to worry about it.
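For anyone landing here later, a rough sketch of that arrangement with a single plot, not the exact code from test3.py. It assumes a Bokeh version where glyph methods accept selection_ and nonselection_ prefixed keywords, uses those instead of assigning a Text() glyph by hand, and uses scatter() in place of circle(). The update_data function is a hypothetical stand-in for the upload callback. It shows the structure only and doesn't explain why the upload path behaved differently.

```python
from bokeh.models import ColumnDataSource, Text
from bokeh.plotting import figure

# Empty source with the column names the glyphs will reference
main_source = ColumnDataSource(dict(index=[], lfc=[], p=[]))

p = figure(tools="box_select")

# Create each renderer exactly once, at module level
p.scatter(x="lfc", y="p", alpha=0.4, source=main_source)
labels = p.text(
    x="lfc", y="p", text="index", source=main_source,
    text_alpha=0.5,
    # styling applied to selected points (assumed prefix keywords)
    selection_text_alpha=1, selection_text_color="red",
    # hide the labels of points that are not selected
    nonselection_text_alpha=0,
)


def update_data(new_columns):
    """Swap in new data; no new renderers are created (hypothetical callback)."""
    main_source.data = new_columns


update_data(dict(index=["a", "b"], lfc=[5, 3], p=[1, 4]))
```

However often update_data() runs, the figure keeps the same two renderers, so glyphs no longer pile up on each upload.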