Scatter markers time series plot interacting too slow

Cosmidark · December 29, 2022, 5:50pm

I have a very similar setup and I’m working with around 6 billion points on 45 graphs… I have had some luck altering the ColumnDataSource whenever the zoom level changes. Essentially, the idea is to push to the browser only the data that falls within the current view. Here is a working example of the idea; I list some caveats at the end.

import numpy as np
from bokeh.events import RangesUpdate
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Range1d
from bokeh.models import RangeSlider
from bokeh.plotting import figure


def event_callback(event):
    # Here you figure out what data should be displayed in the browser and alter the source.
    indices = np.where((x >= event.x0) & (x <= event.x1))
    source.data = {'x': x[indices], 'y': y[indices]}


def update_value(attr, old, current):
    # The slider value has changed, so the range has changed, b/c of the js_link() calls.  
    # However, for some reason, the RangesUpdate callback doesn't happen.  So we "spoof" the event.
    class Event:
        x0 = current[0]
        x1 = current[1]

    event_callback(Event())


# Generate some data
x = np.arange(1000000)
y = np.tile([0, 1], int(len(x) / 2))

# Figure out your initial display range
# With the range, you want to constrain the max zoom level to prevent zooming out so far you get too much data
source = ColumnDataSource(dict(x=x[:150000], y=y[:150000]))
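# max_interval caps how wide the visible window can get; here it matches the initial 150000-point view
fig = figure(width=800, height=200, x_range=Range1d(0, 150000, bounds=(0, 1000000), max_interval=150000, name="foo"))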
fig.step('x', 'y', source=source)

# you will want some kind of slider to move around the data.  RangeTool or RangeSlider may be what you want.
# Start the slider at the same window as the figure's initial x_range
slider = RangeSlider(start=0, end=1000000, value=(0, 150000), width=790)
slider.js_link('value', fig.x_range, 'start', attr_selector=0)
slider.js_link('value', fig.x_range, 'end', attr_selector=1)

#  The slider will adjust the zoom window but this doesn't seem to fire RangesUpdate.  This is the workaround.
slider.on_change("value", update_value)
# RangesUpdate is the event that will let you know the viewing window has changed
fig.on_event(RangesUpdate, event_callback)

curdoc().add_root(column([fig, slider]))

Caveats:

You have to constrain the zoom level. If you let the user zoom out too far, you are back where you started, with too much data to plot.
In some cases you will see a visible drawing effect while panning or zooming. You can try things like pushing data slightly beyond the visible range, or caching some region of the data.
You may need to add one point beyond each end of the visible range to make the glyph render “end to end”. In this example with step, the line is not drawn out to the plot edges b/c I’m not adding the values just outside the range.
If you zoom in far enough that only 0 or 1 points are in the visible range, you may see no data plotted. Again, you should catch this case and add the neighboring data on either end.
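
To make the last two caveats concrete, here is a minimal sketch of how the visible window could be widened by one sample on each side (and optionally by a fraction of its width, for the "push data past the visible range" idea). The helper name visible_slice and its pad_points and margin parameters are made up for illustration and are not part of Bokeh; it assumes x is sorted ascending, as in the example above.

```python
from bisect import bisect_left, bisect_right


def visible_slice(xs, x0, x1, pad_points=1, margin=0.0):
    """Return (lo, hi) such that xs[lo:hi] covers [x0, x1] plus padding.

    xs must be sorted ascending. margin widens the window by that fraction
    of its width on each side; pad_points then adds that many extra samples
    beyond each edge, clamped to the ends of the data.
    """
    width = x1 - x0
    left = x0 - margin * width
    right = x1 + margin * width
    lo = bisect_left(xs, left)    # first index with xs[i] >= left
    hi = bisect_right(xs, right)  # one past the last index with xs[i] <= right
    lo = max(lo - pad_points, 0)
    hi = min(hi + pad_points, len(xs))
    return lo, hi


# A window that contains no samples still returns its two neighbors.
lo, hi = visible_slice(list(range(10)), 3.2, 3.8)
print(lo, hi)  # 3 5
```

In event_callback you could then call lo, hi = visible_slice(x, event.x0, event.x1) and assign x[lo:hi] and y[lo:hi] to source.data. For large NumPy arrays, np.searchsorted with side='left' and side='right' should give the same two indices faster.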

Honestly, if you are zooming around in this much data, draw effects and constrained views don’t seem like a serious issue.

A further idea, instead of the RangeSlider: render the full data set once with something like datashader, show it as an ImageRGBA plot, and put a RangeTool on that plot so the user gets a kind of thumbnail view of the entire dataset to navigate with.
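
If you don't want to pull in datashader just for the thumbnail, a cheaper stand-in is a min/max envelope: split the data into a fixed number of buckets and keep only the minimum and maximum of each. This is not what datashader does, just a rough sketch of a substitute; the function name minmax_overview and the n_buckets parameter are my own invention.

```python
import numpy as np


def minmax_overview(x, y, n_buckets=1000):
    """Reduce (x, y) to one min and one max per bucket for a thumbnail plot."""
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) == 0:
        return x, y, y
    # Never ask for more buckets than there are samples.
    n_buckets = max(1, min(n_buckets, len(x)))
    # Start index of each bucket, using integer math so the indices are exact.
    starts = (np.arange(n_buckets) * len(x)) // n_buckets
    # reduceat reduces y[starts[i]:starts[i + 1]]; the last bucket runs to the end.
    y_min = np.minimum.reduceat(y, starts)
    y_max = np.maximum.reduceat(y, starts)
    return x[starts], y_min, y_max


# One million alternating 0/1 samples collapse to 1000 (x, min, max) triples.
x = np.arange(1000000)
y = np.tile([0, 1], len(x) // 2)
ox, y_lo, y_hi = minmax_overview(x, y)
print(len(ox), y_lo.min(), y_hi.max())  # 1000 0 1
```

The three arrays are small enough to send to the browser as-is, for example as a filled band with varea on a short second figure, and a RangeTool built with x_range=fig.x_range can be added to that figure to drive the main plot.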