Bokeh and binary transfer protocol: example released

Dr. Volker Jaenisch and I are working with big datasets, so we looked for a faster way to send data from a web server to the clients.

One month ago Dr. Volker Jaenisch released this description of the protocol on GitHub:

Currently data is transferred between the Bokeh server and the client as a JSON structure.
While JSON is an established format for AJAX, it is not suitable as a
high-performance data transfer mechanism for scientific applications.

Currently NumPy arrays, which could be transferred in binary mode, are base64 encoded and transferred as text embedded in a JSON structure.
On the JS side, decoding the base64 and shuffling the bytes one by one into JS typed arrays takes a lot of time and RAM.
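The size cost of the base64 step can be illustrated with a small sketch. It uses only Python's standard library, with the `array` module standing in for a NumPy array. The variable names and the 1000-value payload are assumptions chosen for illustration, and the sketch does not measure the decoding cost on the JS side.

```python
import base64
import json
from array import array

# 1000 float64 values; array("d") stands in for a NumPy float64 array here.
values = array("d", range(1000))
raw = values.tobytes()

# Base64 turns every 3 bytes into 4 ASCII characters before JSON embedding.
encoded = base64.b64encode(raw).decode("ascii")
as_json = json.dumps({"x": encoded})

raw_size = len(raw)       # bytes that a binary transfer would send
b64_size = len(encoded)   # characters sent when embedded as text
overhead = b64_size / raw_size
print(f"raw: {raw_size} bytes, base64: {b64_size} chars, ratio: {overhead:.2f}")
```

The encoded text is about one third larger than the raw bytes, and the client still has to decode it before the values can be used.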

As an example we implemented our own binary protocol for data transfer, and it is much faster than Bokeh's JSON transfer.
Our home-brew data structure for transferring data between the Bokeh server and the client is crude, but at least an order of magnitude faster.

From the user's perspective, one defines targets and associates data with them in a metadata dict:

        data_map = {
            'source.data.x': x_data,
            'source.data.index': x_data,
            'source.data.y': y_data,
            'source.data.y_above': y_above,
            'source.data.y_below': y_below,
        }
 

We lay it out in a binary data structure:

Length of Metadata
Metadata
dataset1
dataset2

  1. The position and length of every data structure to be sent are calculated.
  2. The length of the metadata is encoded as a string of 8 digits, which avoids endianness problems.
  3. The metadata is JSON encoded and appended as bytes.
  4. The metadata maps a target (Plot1.source.data.X) to a dataset defined by its position and length in the binary data structure.
  5. A dataset is a binary representation of a NumPy array that can be mapped 1:1 to a JS typed array.

On the JS side we disentangle the metadata first. Then we can map the binary data of the NumPy arrays to their target JS typed arrays very quickly, while respecting endianness.
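The layout and steps above can be sketched roughly as follows. This is a minimal illustration and not the code from our prototype. The function names `pack` and `unpack` and the metadata keys (`pos`, `length`, `typecode`, `byteorder`) are assumptions, and so is the choice to count positions from the start of the data region. Python's standard-library `array` module stands in for NumPy so the sketch runs without dependencies; with NumPy, `ndarray.tobytes()` and `numpy.frombuffer` would fill the same roles. The unpacking half mirrors in Python what the client does in JS.

```python
import json
import sys
from array import array

HEADER_DIGITS = 8  # the metadata length is sent as 8 ASCII digits


def pack(data_map):
    """Build: 8-digit header + JSON metadata + concatenated raw array bytes."""
    targets, chunks, pos = {}, [], 0
    for target, arr in data_map.items():
        raw = arr.tobytes()
        # Assumption: pos is counted from the start of the data region.
        targets[target] = {"pos": pos, "length": len(raw),
                           "typecode": arr.typecode}
        chunks.append(raw)
        pos += len(raw)
    # Record the sender's byte order so the receiver can respect endianness.
    meta = {"byteorder": sys.byteorder, "targets": targets}
    meta_bytes = json.dumps(meta).encode("utf-8")
    header = f"{len(meta_bytes):0{HEADER_DIGITS}d}".encode("ascii")
    return header + meta_bytes + b"".join(chunks)


def unpack(payload):
    """Read the metadata first, then map each byte range back to an array."""
    meta_len = int(payload[:HEADER_DIGITS].decode("ascii"))
    data_start = HEADER_DIGITS + meta_len
    meta = json.loads(payload[HEADER_DIGITS:data_start].decode("utf-8"))
    result = {}
    for target, info in meta["targets"].items():
        start = data_start + info["pos"]
        arr = array(info["typecode"])
        arr.frombytes(payload[start:start + info["length"]])
        if meta["byteorder"] != sys.byteorder:
            arr.byteswap()  # sender and receiver disagree on byte order
        result[target] = arr
    return result


x_data = array("d", [0.0, 0.5, 1.0])
y_data = array("i", [1, 2, 3])
payload = pack({"source.data.x": x_data, "source.data.y": y_data})
restored = unpack(payload)
```

One thing the sketch leaves out: a JS typed array view over an `ArrayBuffer` needs a byte offset that is a multiple of its element size, so a real layout would probably pad each dataset accordingly.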

This works quite well, and we would like to implement it in Bokeh.

Now we have finished our prototype. You can have a look at the full example on GitHub: Inqbus/inqbus.graphdemo, a little demo showing how Bokeh can work with large datasets.

Best Regards,

Sandra Rum