DataTable column sorting unacceptably slow

Hello 🙂

I’m trying to analyze a medium-sized dataset (250,000 rows) with Bokeh. I have several linked views and wanted to use a DataTable to quickly select data points with high or low values for a given attribute. To do that I’m using the table’s sorting feature (clicking a column name to sort by that column), and it turns out to be extremely slow, freezing the application for several minutes.

I’m a little puzzled, since sorting 250k points, or getting their sorted indices, is done in an instant with np.argsort, for example. I have created a small Python script to reproduce this behavior. Is there something I’m missing? Maybe a specific configuration of DataTable that does the trick?

from bokeh.models import ColumnDataSource, DataTable, TableColumn
from bokeh.plotting import show
import numpy as np

if __name__ == '__main__':
    # 6 attributes with 250,000 random values each
    data = np.random.rand(6, 250000)
    dataset = {f"attr{i}": data[i, :] for i in range(6)}
    # for comparison: sorting the same data with NumPy is done in an instant
    np.argsort(data[0, :])
    cds = ColumnDataSource(data=dataset)
    columns = [TableColumn(field=f"attr{i}", title=f"attr{i}") for i in range(6)]
    table = DataTable(source=cds, columns=columns)
    # clicking a column header in the rendered table triggers the slow sort
    show(table)

This is on Bokeh 3.6.2.
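
For comparison, the NumPy baseline mentioned above can be sketched roughly as follows. It times np.argsort on 250,000 random values and takes the indices of the lowest and highest entries. The seed, the variable names, and the cutoff k = 100 are assumptions for illustration and are not part of the original script.

```python
import time

import numpy as np

# Same size as one column of the table in the script above.
rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
values = rng.random(250_000)

# Time the sort that the post compares the table against.
start = time.perf_counter()
order = np.argsort(values)  # indices that would sort `values` ascending
elapsed = time.perf_counter() - start
print(f"argsort of {values.size} values took {elapsed * 1000:.1f} ms")

# Indices of the k lowest and k highest values (k is an arbitrary example).
k = 100
lowest = order[:k]          # ascending: smallest value first
highest = order[-k:][::-1]  # descending: largest value first
```

As a possible workaround on an affected version, index lists like these could be assigned to the data source's selection (for example cds.selected.indices = highest.tolist()) instead of sorting in the table. That sidesteps the slow sort rather than fixing it.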

I don’t have any immediate suggestions except to submit a GitHub Issue with full details.

Alright, thanks for the quick response. I submitted an issue.

Thanks @hageldave, I’ll try to get the fix into a 3.6.3 release in the next week.

Bokeh 3.6.3 has been released with a fix.
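
To check whether an environment already has that release, something like the sketch below may help. The helper name has_sort_fix is hypothetical, and the check assumes that every version from 3.6.3 onward keeps the fix and that the version string starts with three numeric components.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First release reported in this thread to contain the fix.
FIXED_IN = (3, 6, 3)


def has_sort_fix(version_string):
    """Return True if the version is at least 3.6.3 (hypothetical helper).

    Only the leading major.minor.patch numbers are compared; suffixes such
    as ".dev1" or "rc1" are ignored, which is an assumption of this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups()) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = version("bokeh")  # version of the installed distribution
    except PackageNotFoundError:
        print("bokeh is not installed in this environment")
    else:
        print(f"bokeh {installed}: fix expected = {has_sort_fix(installed)}")
```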