Need help on proper/efficient way to update plot with Bokeh Server

Hi, I was recently introduced to Bokeh and love its potential! I work at a pharma company and have set out to create a Bokeh server app for data visualization. I have had a hard time efficiently refreshing plots in the client-side browser, so I coded a smaller app that replicates the issue. The app produces a scatter plot using (x,y) coordinates from a data frame. I want to add interactions such that when a user either (1) selects points on the plot or (2) enters comma-separated point names in the text box (e.g. “p55, p1234”), those points turn red on the scatter plot.
I have pasted the code here: http://pastebin.com/JvQ1UpzY
The code that updates the graph can be found in the function refresh_graph().

I have found that if I drill into the ColumnDataSource and change only a few values, they do not get updated in the graph in the client browser (Strategy 1). The only way I can seem to make the client browser update is to completely reassign the value to the plot’s figure (Strategy 3). However, even though Strategy 3 works, it takes a relatively long time (~500ms vs ~1ms).

Thanks for your help!

Hi,

Bokeh 0.11 is not very smart about updating data: all updates are at the attribute level, which means the change notification is for ColumnDataSource’s data attribute, and if anything changes we send the entire data again.

In the short term, the most important thing to do is probably to update everything you plan to change about data and only then set the new data, so the change notification happens once. If you write code like this, there are two notifications:

source.data['x'] = array1  # notifies by resending the entire data
source.data['y'] = array2  # notifies again, resending the entire data again

This is both inefficient and looks ugly (two visible updates to the plot). Better is to first make a whole new dict and then:

source.data = new_data # notify one time

We intercept changes to the dict (setting a whole new column) but we don’t wrap the individual data arrays so if you modify them “underneath” the data source, we don’t know about the changes I guess. It is possible to do a source.trigger('data', source.data, source.data) which should cause things to be sent on the wire by forcing a change notification. Making this work magically might create an unfortunate performance penalty… it may not make sense.
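To make the three cases above concrete, here is a minimal toy model in plain Python. It is only a sketch: NotifyingDict and ToySource are hypothetical names made up for illustration, not Bokeh classes, and the counter stands in for "resend the entire data over the wire".

```python
class NotifyingDict(dict):
    """Hypothetical dict wrapper that reports column assignments to its owner."""

    def __init__(self, values, on_change):
        super().__init__(values)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._on_change()  # one notification per assigned column


class ToySource:
    """Hypothetical data source; counts how often the full data would be resent."""

    def __init__(self, data):
        self.notifications = 0
        self._data = NotifyingDict(data, self._notify)

    def _notify(self):
        self.notifications += 1

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, new_data):
        self._data = NotifyingDict(new_data, self._notify)
        self._notify()  # replacing the whole dict notifies once


source = ToySource({"x": [1, 2, 3], "y": [4, 5, 6]})

# Column-by-column assignment: two notifications.
source.data["x"] = [10, 20, 30]
source.data["y"] = [40, 50, 60]
per_column = source.notifications

# Build the new dict first, then assign it once: one notification.
source.data = {"x": [7, 8, 9], "y": [1, 1, 1]}
whole_dict = source.notifications - per_column

# Mutating a column in place goes "underneath" the wrapper: no notification.
before = source.notifications
source.data["x"][0] = 99
in_place = source.notifications - before
```

In this toy model per_column is 2, whole_dict is 1 and in_place is 0, which matches the behavior described above: the in-place edit is invisible unless something forces a change notification.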

Two PRs that would be great to have in the future:

  • support partial updates to dictionaries and lists. We track changes to dicts and lists via https://github.com/bokeh/bokeh/blob/3e2a9737d510abede1be090e4ee9fa6dcab037aa/bokeh/core/property_containers.py and the information that we’ve appended some elements or modified a single element in-place could be communicated through _notify_owners into properties.py and ultimately result in extra information attached to the ModelChangedEvent emitted by Document. Then the websocket protocol could send a message with only the new or changed elements.

  • support binary encoding; rather than inserting the array into the JSON, insert a marker in the JSON that says "byte array <ID> will follow which has type <datatype>", then append to the websocket message said byte array. This should only happen when the JSON is websocket-bound and not when it’s going to a file or elsewhere, so it’s probably an optional feature of Model.to_json and Document.to_json.
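As a rough illustration of the first idea (partial updates), the sketch below compares sending a whole column with sending only the changed elements. The (index, value) message format and the make_patch / apply_patch helpers are assumptions invented for this example; they are not an existing Bokeh protocol.

```python
import json


def make_patch(old_column, new_column):
    """Return (index, value) pairs for entries that differ (hypothetical format)."""
    if len(old_column) != len(new_column):
        raise ValueError("columns must have the same length")
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(old_column, new_column))
        if old != new
    ]


def apply_patch(column, patch):
    """Apply (index, value) pairs to a copy of the column, as a receiver would."""
    patched = list(column)
    for index, value in patch:
        patched[index] = value
    return patched


old_colors = ["steelblue"] * 1000
new_colors = list(old_colors)
new_colors[55] = "red"
new_colors[123] = "red"

patch = make_patch(old_colors, new_colors)

# Full resend versus patch-only message, both serialized as JSON.
full_message = json.dumps({"color": new_colors})
patch_message = json.dumps({"color": patch})
```

With only two points recolored out of a thousand, the patch message carries two entries while the full message carries the whole column, which is where the savings for small edits would come from.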

Those two changes should be a nice performance boost in many situations, though of course there are limits (no matter what we do, if a callback changes lots of values in a really large data array, we’ll be shoveling a lot of data over the socket). But optimizing append or single-dict patches should be helpful for things like a streaming data source, and binary encoding should be a nice constant-factor improvement in performance.
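The binary-encoding idea can be sketched the same way. The header layout below (a "binary" entry with an id, dtype and length) is purely an assumption for illustration; the point is only that the numbers travel as raw bytes after a small JSON marker instead of as decimal text inside the JSON.

```python
import json
import struct


def encode_with_binary(column_name, values):
    """Return (json_header, payload): a hypothetical marker plus raw float64 bytes."""
    payload = struct.pack("<%dd" % len(values), *values)
    header = json.dumps({
        "column": column_name,
        "binary": {"id": 0, "dtype": "float64", "length": len(values)},
    })
    return header, payload


def decode_with_binary(header, payload):
    """Rebuild the list of floats from the marker and the byte array."""
    meta = json.loads(header)["binary"]
    return list(struct.unpack("<%dd" % meta["length"], payload))


values = [i / 7.0 for i in range(1000)]
header, payload = encode_with_binary("x", values)

# The same column encoded as plain JSON text, for comparison.
json_only = json.dumps({"x": values})
```

Each float64 costs exactly 8 bytes in the payload, whereas its decimal form in JSON is usually longer, so the gain is a constant factor rather than a change in how much data must be sent.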

Havoc

You received this message because you are subscribed to the Google Groups “Bokeh Discussion - Public” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/5b2120dc-242d-4f96-b9ec-2f18a3b1250a%40continuum.io.

For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Havoc Pennington

Senior Software Architect

Havoc,

Would dict.update({'x': array1, 'y': array2}) trigger a single remote update, and avoid having to rebuild a new dict?

Ben

It should, yes:

https://github.com/bokeh/bokeh/blob/master/bokeh/core/property_containers.py#L159

(the dict will be wrapped in that dict proxy class so we see changes)
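For readers wondering why a proxy class is needed at all: in CPython, dict.update() on a dict subclass does not go through an overridden __setitem__, so a wrapper has to intercept update() itself. The toy class below shows that shape. NotifyingDict is a hypothetical name for illustration and is not the class in the linked file.

```python
class NotifyingDict(dict):
    """Hypothetical dict proxy that reports changes through a callback."""

    def __init__(self, values, on_change):
        super().__init__(values)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._on_change()  # one notification per assigned key

    def update(self, *args, **kwargs):
        # The built-in update() does not call the overridden __setitem__,
        # so the whole batch results in exactly one notification here.
        super().update(*args, **kwargs)
        self._on_change()


events = []
data = NotifyingDict({"x": [1, 2], "y": [3, 4]}, lambda: events.append("changed"))

# Both columns change, but only one change notification is recorded.
data.update({"x": [5, 6], "y": [7, 8]})
```

In this sketch a single update() call with two columns records one event, while assigning the two columns separately would record two.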

Havoc

Thank you all for the very fast responses!

I have tried all of the suggested ways, but it still does not work properly. The code for the refresh function is now:

updated_color_list = ['steelblue'] * len(self.graph_plot.data_source.data['color'])
for idx in new_idxs:
    updated_color_list[idx] = 'red'
self.graph_source.data.update({'color': updated_color_list})
self.graph_source.trigger('data', self.graph_source.data, self.graph_source.data)

Interestingly, this works, but on a “selection delay”. It almost seems as if the default selection behavior is overriding my changes. I have added a set of pictures that can explain what I mean.
Figure 1. I zoom in on the coordinates between (0,0) and (50,50).
Figure 2. I select coordinates between (10,20) and (30,40). I would expect all those coordinates to turn RED, but all remain BLUE.
Figure 3. I select coordinates between (10,10) and (40,40). I would expect all those coordinates to turn RED. Instead, the points from the previous selection - those between (10,20) and (30,40) - turn RED. The additional coordinates are BLUE.
Figure 4. I reset the zoom to its original scope and see that

It almost seems as if the default selection behavior is overriding my changes.

That's probably exactly what is happening. If you don't want the default selection behavior (e.g. because you want to control it yourself), then you need to turn it off explicitly. By default there is only a policy for "nonselected" points, so:

  r = fig.circle(...)

  r.nonselection_glyph = None

You can see more information about selection/nonselection/hover policies here:

  http://bokeh.pydata.org/en/latest/docs/user_guide/styling.html#selected-and-unselected-glyphs

I will also say that if you just want to update the visual properties of glyphs based on selections, using the methods in that section will be much simpler and more efficient (since they happen entirely on the client).

Thanks,

Bryan

This is both inefficient and looks ugly (two visible updates to the plot). Better is to first make a whole new dict and then:

source.data = new_data # notify one time

We intercept changes to the dict (setting a whole new column) but we don't wrap the individual data arrays so if you modify them "underneath" the data source, we don't know about the changes I guess. It is possible to do a `source.trigger('data', source.data, source.data)` which should cause things to be sent on the wire by forcing a change notification. Making this work magically might create an unfortunate performance penalty... it may not make sense.

I'm guessing it would not but I don't know for sure.
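To make the notification behaviour described above concrete, here is a minimal toy model in plain Python. `ToySource` and `_NotifyingDict` are invented names for illustration and are not Bokeh classes; the sketch only assumes, as the thread says, that assigning a column or replacing the whole dict is noticed, while edits inside a column are not.

```python
# Toy stand-in for a data source; NOT Bokeh's implementation.
class _NotifyingDict(dict):
    """Dict that reports whole-column assignment to its owner."""

    def __init__(self, owner, data):
        super().__init__(data)  # does not go through __setitem__
        self._owner = owner

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._owner.notifications += 1


class ToySource:
    """Counts change notifications at the attribute level."""

    def __init__(self, data):
        self.notifications = 0
        self._data = _NotifyingDict(self, data)

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, new_data):
        # Replacing the whole dict counts as one change.
        self._data = _NotifyingDict(self, new_data)
        self.notifications += 1


source = ToySource({"x": [1, 2, 3], "color": ["blue", "blue", "blue"]})

# Two column assignments produce two notifications.
source.data["x"] = [4, 5, 6]
source.data["color"] = ["red", "blue", "blue"]
per_column = source.notifications

# Editing an element inside a column bypasses the wrapper entirely.
source.data["color"][1] = "red"
after_in_place = source.notifications

# Build the full replacement first, then assign it once.
new_data = dict(source.data)
new_data["x"] = [7, 8, 9]
new_data["color"] = ["red", "red", "blue"]
source.data = new_data
after_batch = source.notifications
```

In this model the two column assignments cost two notifications, the in-place edit costs none (which is why the client never sees it), and the final whole-dict assignment costs exactly one.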

Two PRs that would be great to have in the future:

* support partial updates to dictionaries and lists. We track changes to dicts and lists via https://github.com/bokeh/bokeh/blob/3e2a9737d510abede1be090e4ee9fa6dcab037aa/bokeh/core/property_containers.py and the information that we've appended some elements or modified a single element in-place could be communicated through _notify_owners into properties.py and ultimately result in extra information attached to the ModelChangedEvent emitted by Document. Then the websocket protocol could send a message with only the new or changed elements.

Personal position: I'd really prefer:

* an explicit interface for rolling/appending to column data sources, and

* a new protocol message for incremental updates to column data sources

The reason for the first preference is that it is always an error to have columns with differing lengths. With a dedicated "streaming" interface, this can be enforced much more easily. That is, users would have to do something like:

  # append to columns foo, bar, dropping earlier points to keep column size at 300 or less
  source.stream(dict(foo=[10, 20], bar=[100, 200]), limit=300)

And then if there are other columns that are mistakenly not updated at the same time but should be, we can raise an exception immediately.
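As a rough sketch of how such a streaming interface could enforce equal column lengths, the function below works on a plain dict of lists. The free function `stream` and its error messages are hypothetical, modelled only on the proposed call above; this is not an existing Bokeh API.

```python
def stream(data, new_data, limit=None):
    """Append new_data to every column of data, keeping at most `limit` rows.

    Hypothetical helper for illustration; validates before mutating anything.
    """
    if set(new_data) != set(data):
        missing = sorted(set(data) - set(new_data))
        unknown = sorted(set(new_data) - set(data))
        raise ValueError(f"columns must match: missing={missing}, unknown={unknown}")
    if len({len(values) for values in new_data.values()}) > 1:
        raise ValueError("all streamed columns must have the same length")

    for name, values in new_data.items():
        column = data[name] + list(values)
        if limit is not None:
            # Drop the oldest rows so the column never exceeds `limit`.
            column = column[max(len(column) - limit, 0):]
        data[name] = column
    return data


columns = {"foo": [1, 2, 3], "bar": [10, 20, 30]}
stream(columns, dict(foo=[10, 20], bar=[100, 200]), limit=4)

# Forgetting a column is reported immediately instead of corrupting the data.
try:
    stream(columns, dict(foo=[30]))
    error_raised = False
except ValueError:
    error_raised = True
```

After the first call both columns hold four rows (`foo` is `[2, 3, 10, 20]`, `bar` is `[20, 30, 100, 200]`), and the second call raises because `bar` was not updated.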

The reason for the second preference is that a new message should be extremely simple to implement and could be tested in isolation. Adding yet more functionality to the current "mega message" does not excite me.
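For illustration only, a dedicated incremental message might be as small as the sketch below. The message layout (`type`, `source`, `data`, `limit`) and both function names are assumptions made up for this example; nothing here reflects the actual Bokeh protocol.

```python
import json


def make_stream_message(source_id, new_data, limit=None):
    """Serialize only the appended rows (invented message layout)."""
    return json.dumps(
        {"type": "stream", "source": source_id, "data": new_data, "limit": limit},
        sort_keys=True,
    )


def apply_stream_message(sources, message):
    """Receiver side: append the rows in the message to the named source."""
    payload = json.loads(message)
    if payload["type"] != "stream":
        raise ValueError("unexpected message type: " + payload["type"])
    columns = sources[payload["source"]]
    limit = payload["limit"]
    for name, values in payload["data"].items():
        merged = columns[name] + values
        if limit is not None:
            merged = merged[max(len(merged) - limit, 0):]
        columns[name] = merged


# A receiver that already holds 10000 rows only needs the two new ones.
sources = {"src-1": {"foo": list(range(10000))}}
message = make_stream_message("src-1", {"foo": [10000, 10001]}, limit=10000)
full_resend = json.dumps(sources["src-1"])
apply_stream_message(sources, message)
```

Because the message carries only the new rows, its size does not grow with the existing column, and the two functions can be exercised without a server or a websocket.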

* support binary encoding; rather than inserting the array into the JSON, insert a marker in the JSON that says "byte array <ID> will follow which has type <datatype>", then append to the websocket message said byte array. This should only happen when the JSON is websocket-bound and not when it's going to a file or elsewhere, so it's probably an optional feature of Model.to_json and Document.to_json.

I think it would also be worth considering a Base64 binary encoding for standalone (non-server) documents. (reason: true reliable NaN handling in all cases)
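A minimal sketch of the Base64 variant, using only the standard library, shows why it helps with NaN: strict JSON cannot represent NaN or infinity, but packed bytes can. The marker keys (`base64`, `dtype`, `length`) are invented for this example and are not a real Bokeh format; the websocket variant would send the raw bytes after the JSON rather than embedding them.

```python
import base64
import json
import math
import struct


def encode_column(values):
    """Pack floats as little-endian float64 inside a JSON-safe marker (invented layout)."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return {
        "base64": base64.b64encode(raw).decode("ascii"),
        "dtype": "float64",
        "length": len(values),
    }


def decode_column(marker):
    """Reverse encode_column and return a list of floats."""
    raw = base64.b64decode(marker["base64"])
    return list(struct.unpack(f"<{marker['length']}d", raw))


values = [1.5, float("nan"), float("inf")]

# Strict JSON refuses NaN and infinity when written as plain numbers.
try:
    json.dumps(values, allow_nan=False)
    plain_json_ok = True
except ValueError:
    plain_json_ok = False

# The encoded form is ordinary strings and integers, so strict JSON accepts it.
message = json.dumps({"x": encode_column(values)}, allow_nan=False)
decoded = decode_column(json.loads(message)["x"])
```

The round trip preserves all three values, including the NaN, while the plain-number encoding fails under strict JSON.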

Bryan

On Jan 24, 2016, at 10:37 AM, Havoc Pennington <[email protected]> wrote:

Those two changes should be a nice performance boost in many situations, though of course there are limits (no matter what we do, if a callback changes lots of values in a really large data array, we'll be shoveling a lot of data over the socket). But optimizing append or single-dict patches should be helpful for things like a streaming data source, and binary encoding should be a nice constant-factor improvement in performance.

Havoc

On Sat, Jan 23, 2016 at 8:51 PM, wcopeland <[email protected]> wrote:
Hi, I recently was introduced to Bokeh and love its potential! I work at a pharma company and have set out to create a bokeh server app for data visualization. I have had a hard time efficiently refreshing plots in the client-side browser. I coded a smaller app that replicates the issue I am having. The app produces a scatter plot using (x,y) coordinates from a data frame. I want to add interactions such that when a user either (1) selects points on the plot or (2) enters the name of comma-separated points in the text box (ie. "p55, p1234"), then those points will turn red on the scatter plot.

I had pasted the code here: http://pastebin.com/JvQ1UpzY

The code that updates the graph can be found in the function refresh_graph().

I have found that if I drill into the ColumnDataSource and change only a few values, they do not get updated in the graph on client-browser (Strategy 1). The only way I can seem to make the client-browser update is if I completely reassign the value to the plot's figure (Strategy 3). However, even though Strategy #3 works, it takes a relatively long time (~500ms vs ~1ms).

Thanks for your help!
