DateRangeSlider : value_as_datetime (pandas.datetime) vs CDSview as numpy.datetime64

awsbarker · November 11, 2021, 9:13am

I’m trying to set up a DateRangeSlider but keep getting conflicts between DateRangeSlider.value_as_datetime, which is in Python datetime.datetime format, and the filtered view, which uses a Boolean-filtered CDSView based on numpy.datetime64 format.

I can force a near solution by converting to numpy each time the DateRangeSlider field is changed i.e.

np.datetime64(dslider.value_as_datetime[1])

Each element of the tuple then becomes something like numpy.datetime64(‘2037-07-26T00:00:00.000000’); note the microsecond (‘us’) precision.

So I end up adjusting my boolean filter like this inside the .on_change() callback:

bools = ((view_source.data['last'] <= np.datetime64(dslider.value_as_datetime[1]))
         & (view_source.data['last'] >= np.datetime64(dslider.value_as_datetime[0])))
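
The same filter can be written as a small standalone function, which makes it easier to test outside the callback. This is only a sketch: range_mask is a hypothetical name, and it assumes the column is a NumPy datetime64 array and the bounds are datetime.datetime objects.

```python
from datetime import datetime

import numpy as np


def range_mask(values, start, end):
    """Boolean mask of entries with start <= value <= end (hypothetical helper)."""
    lo = np.datetime64(start)
    hi = np.datetime64(end)
    values = np.asarray(values)
    # NumPy compares datetime64 values of different units by casting them
    return (values >= lo) & (values <= hi)


last = np.array(
    ["2021-09-24T02:58", "2021-09-26T13:03", "2021-09-27T04:15"],
    dtype="datetime64[ns]",
)
mask = range_mask(last, datetime(2021, 9, 25), datetime(2021, 9, 27))
print(mask.tolist())
```

Inside the callback, the bounds would be dslider.value_as_datetime[0] and dslider.value_as_datetime[1], and the resulting list could be assigned to the BooleanFilter's booleans property.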

The CDSView values are in fact numpy.datetime64(‘2021-07-25T00:00:00.000000000’); note the nanosecond (‘ns’) precision.

But I still have to deal with ‘ns’ and ‘us’ conversions and all the fun that brings…
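
For what it is worth, the unit mismatch can be checked in isolation. The following is a minimal sketch, assuming the slider bounds arrive as plain datetime.datetime objects (as value_as_datetime provides); the helper name to_ns is made up for illustration and is not part of Bokeh or NumPy.

```python
from datetime import datetime

import numpy as np


def to_ns(value):
    """Convert a datetime.datetime to numpy.datetime64 at nanosecond precision.

    Hypothetical helper, not a Bokeh or NumPy API.
    """
    return np.datetime64(value).astype("datetime64[ns]")


bound = datetime(2021, 7, 25)
as_us = np.datetime64(bound)  # NumPy picks microseconds for datetime.datetime input
as_ns = to_ns(bound)          # explicit cast to match a datetime64[ns] column

print(as_us.dtype, as_ns.dtype)
# The two values denote the same instant, so they compare equal across units
print(as_us == as_ns)
```

Since the comparison works across units, the explicit cast mostly matters when the dtypes have to match exactly, for example when writing values back into a column.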

I realise this is not really a Bokeh issue, since it is a common headache elsewhere. I just wondered whether there could be an option to get DateRangeSlider.value converted to numpy.datetime64, something like slider_value_as_np_datetime, since datetime data will always be stored inside the CDSView as np.datetime64 in NumPy arrays?

Or am I missing something?

Bryan · November 11, 2021, 4:19pm

There is not currently; you could open a GitHub Issue to propose it as future feature development.

awsbarker · November 12, 2021, 8:14am

Thanks Bryan,

I will look into a GitHub development request.

But first I think there is a bug that causes an updated DateRangeSlider to change the dtype of the source.data that is stored as numpy.datetime64 to numpy.float64.

It gets a bit scary when I start to read that this could be to do with little- and big-endian byte order…

If this is not obvious or known, I can strip down my code to a basic model so you can check; just let me know.

awsbarker · November 12, 2021, 9:13am

Bryan, it didn’t take too long to mock up a dummy model.

It just shows, via print output, the change in source.data from np.datetime64 to np.float64 once the slider is updated:

# bokeh 2.4.1 pandas 1.3.4
from bokeh.palettes import Spectral5
from bokeh.layouts import row, column
from bokeh.models import ColumnDataSource, CDSView, DateRangeSlider, BooleanFilter, FactorRange
from bokeh.plotting import figure, show, curdoc
from bokeh.transform import factor_cmap
import pandas as pd
from datetime import datetime as dt, timedelta
import numpy as np

def getdata():
    orgDict = {1: 'GM_SL', 2: 'GM_AUS_NZ', 3: 'GM_CZ'}
    tuptup = [{'imei': 354033090692743, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 26, 13, 3, 1), 'count': 15, 'org': 1},
          {'imei': 354033090658264, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 24, 4, 22), 'count': 9, 'org': 2},
          {'imei': 354033090690671, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 24, 2, 58), 'count': 7, 'org': 3},
          {'imei': 354033090657951, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 24, 6, 7), 'count': 24, 'org': 1},
          {'imei': 354033090680722, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 27, 4, 15), 'count': 20, 'org': 2},
          {'imei': 354033090698245, 'checked': dt(2021, 9, 27, 22, 11, 5), 'last': dt(2021, 9, 24, 4, 38), 'count': 6, 'org': 3}]
    df1 = pd.DataFrame([i for i in tuptup]).astype({'imei': np.int64, 'count' : np.int16, 'org': np.int8}) # no effect 'last' : '<M8[s]'})
    df1.org = df1.org.map(orgDict) #.astype('category')
    df1.set_index('last', drop=False, inplace=True)
    df1.sort_index(inplace=True)

    orgs_wk = df1.groupby(['org', pd.Grouper(level=0, freq='W')])['imei'].nunique().to_frame()
    no_orgs = orgs_wk.index.unique('org').to_list()
    org_cmap = factor_cmap('xaxis', palette=Spectral5, factors= no_orgs, end=1)
    orgs_wk['xaxis'] = [(str(x[0]), x[1].date().strftime('%W %y')) for x in orgs_wk.index]
    ts = orgs_wk['ts'] = orgs_wk._get_label_or_level_values('last')  # converted to numpy.dt64 (int64) in CDS
    return orgs_wk, ts, org_cmap

orgs_wk, ts, org_cmap = getdata()
source_orgs_wk = ColumnDataSource(orgs_wk)

doem = (dt.now().replace(day=1) + timedelta(days=32)).replace(day=1, hour=0, minute=0, second=0, microsecond=0)
bool_orgs_wk = orgs_wk.index.get_level_values('last') <= doem #dev_org_wk.ts <= doem.date()  #) & (ts > dt(2021, 1, 1))

dslider = DateRangeSlider(title='Period', value=(ts.min(), ts.max()), start=ts.min(), end=ts.max(), step=30, width_policy='auto')  #dtype('<M8[s]')
view_orgs_wk = CDSView(source=source_orgs_wk, filters= [BooleanFilter(bool_orgs_wk)])

print(source_orgs_wk.data['ts'][1].dtype)

def updater(attr, old, new):
    print(source_orgs_wk.data['ts'][1], source_orgs_wk.data['ts'][1].dtype)  # datetime64 has been converted to np.float64
    print(type(new[1]), type(dslider.value[1]))  # int
    print(dslider.value[1] > source_orgs_wk.data['ts'][1])  # the comparison works: int vs np.float64

p = figure(plot_width=800, plot_height=300, title="test", x_range = FactorRange(*view_orgs_wk.source.data['xaxis'].tolist()), toolbar_location='above')
p.vbar(x='xaxis', top='imei', width=0.9, line_color="black",  source= source_orgs_wk, view = view_orgs_wk, fill_color=org_cmap) #, fill_color=index_cmap, )
p.xaxis[0].major_label_orientation = "vertical"

dslider.on_change('value_throttled', updater)

c = column(dslider, p,)
layout = column(c)
curdoc().add_root(layout)
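
One way to make the comparison in updater independent of the column's dtype is to normalise both sides to milliseconds since the epoch before comparing. A rough sketch, assuming the slider value arrives in the callback as a pair of millisecond numbers (which the int types printed in updater suggest); column_as_ms and slider_mask are illustrative names, not Bokeh functions.

```python
import numpy as np


def column_as_ms(column):
    """Return the column as float64 milliseconds since the Unix epoch."""
    arr = np.asarray(column)
    if np.issubdtype(arr.dtype, np.datetime64):
        # datetime64 of any unit -> whole milliseconds -> float
        return arr.astype("datetime64[ms]").astype("int64").astype("float64")
    # Assumption: anything else is already numeric milliseconds
    return arr.astype("float64")


def slider_mask(column, value):
    """Boolean mask of rows whose timestamp lies within value = (low_ms, high_ms)."""
    lo, hi = value
    ms = column_as_ms(column)
    return (ms >= lo) & (ms <= hi)


ts = np.array(["2021-09-24", "2021-09-26", "2021-09-28"], dtype="datetime64[ns]")
value = (
    int(np.datetime64("2021-09-25", "ms").astype("int64")),
    int(np.datetime64("2021-09-27", "ms").astype("int64")),
)

print(slider_mask(ts, value).tolist())                # datetime64 column
print(slider_mask(column_as_ms(ts), value).tolist())  # float64 column
```

Both calls should give the same mask, so the callback would not need to care which dtype the column currently has.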

Bryan · November 12, 2021, 8:58pm

Hi @awsbarker, the actual underlying type used by BokehJS is always “floating point milliseconds since epoch”. Although we try to maintain original types as much as possible on the Python side, I expect what is happening is that the values are being converted to the wire type at serialization and, since this is a Bokeh server app, being reflected back as a floating-point array at some point. I am not certain there is much we will be able to do about that, but since you have an MRE you could start a GitHub development discussion, since that is a more appropriate venue for dev-adjacent (non-support) questions.
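
If the column does come back as floating-point milliseconds, it can be turned back into datetime64 on the Python side. A minimal sketch under that assumption; from_epoch_ms is a hypothetical helper name, and fractional milliseconds are truncated.

```python
import numpy as np


def from_epoch_ms(ms):
    """float milliseconds since the Unix epoch -> datetime64[ns] array."""
    whole_ms = np.asarray(ms, dtype="float64").astype("int64")  # drops any fraction
    return whole_ms.astype("datetime64[ms]").astype("datetime64[ns]")


# Example values standing in for a column reflected back as float64
reflected = np.array([0.0, 1000.0, 1632661381000.0])
restored = from_epoch_ms(reflected)
print(restored.dtype)
print(restored)
```

The last value corresponds to 2021-09-26T13:03:01, one of the 'last' timestamps in the MRE.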
