Performance drop creating/updating the same plot N times

Hi all,
I just opened an issue on GitHub. It’s related to a performance drop I observe when creating/updating the same plot several times from the same cell - i.e. N runs of the same code from the same cell.
Performance is quite good at the beginning but degrades quickly as the number of runs increases. We would have expected a constant execution time.
Attached is a notebook revealing the problem. I would be grateful if someone could validate the code (it might be wrong) and/or reproduce the problem.
Thanks.
N.

How to use the provided notebook:
Run cells #1 & #2. You can then run cell #3 up to MAX_RUN times.
Plots f1 & f2 display the ‘show’ and ‘push_notebook’ execution time history for the last ‘RUN’ runs.

DropInPerformance.ipynb (3.35 KB)

Could be related to this issue.

In our application, an image is acquired line-by-line on a remote system. The full image data is periodically polled from the system (drt) and displayed in a Jupyter notebook. The associated image plot is updated using calls to push_notebook. Note that we don’t use the Bokeh server - and don’t actually want to :slight_smile: Everything is asynchronously scheduled using the “Tornado IOLoop approach” - which works smoothly.
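Roughly, the update loop looks like this (a sketch: acquire_image and the 500 ms polling period stand in for our actual polling code):

```python
import numpy as np
from bokeh.io import show, push_notebook
from bokeh.plotting import figure
from tornado.ioloop import PeriodicCallback

def start_streaming(acquire_image, period_ms=500):
    """Display an image plot and refresh it periodically via push_notebook.

    Call from a notebook cell after bokeh.io.output_notebook().
    `acquire_image` is a placeholder for the code that polls the full
    image data (a 2-D numpy array) from the remote system.
    """
    plot = figure(x_range=(0, 1), y_range=(0, 1))
    renderer = plot.image(image=[np.zeros((2, 2))], x=0, y=0, dw=1, dh=1,
                          palette="Greys256")
    handle = show(plot, notebook_handle=True)

    def poll_and_refresh():
        renderer.data_source.data["image"] = [acquire_image()]
        push_notebook(handle=handle)  # send only the document diff

    # Scheduled on the notebook's own Tornado IOLoop - no Bokeh server
    cb = PeriodicCallback(poll_and_refresh, period_ms)
    cb.start()
    return cb
```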

The first figure below shows the image refresh performance (prt) obtained during a first acquisition sequence. Since the refresh time is proportional to the image size, we observe the expected linear increase. At the end of the acquisition, the 1-Mpixel image is updated in approx. 1.75 s.

Starting a new acquisition sequence, one would expect the refresh time to drop back down to the value observed at the beginning of the first sequence - something like 0.15 s. Unfortunately, it doesn’t: it remains at the value reached at the end of the first sequence! Here is what we obtain at the beginning of the second acquisition sequence:

Does anybody see what could explain this behavior?

Thanks.

At the moment the best suggestion I can offer is to reconsider using the Bokeh server, especially in light of the recent demonstration of embedding the server directly on a notebook IOLoop, instead of in a separate process:

  https://github.com/bokeh/bokeh/blob/master/examples/howto/server_embed/notebook_embed.ipynb

This approach is currently a bit clunky, but it is usable, and the issues described there will be resolved in the next release.
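In rough outline (a sketch; see the linked notebook for the real details), an embeddable app is just a function that populates a Document:

```python
from bokeh.plotting import figure

def bkapp(doc):
    # A Bokeh application is a function that populates a Document.
    # When embedded, it runs on the notebook's IOLoop rather than in a
    # separate `bokeh serve` process.
    p = figure()
    p.line([1, 2, 3], [1, 4, 9])
    doc.add_root(p)

# In a notebook cell (after output_notebook()), something along the lines
# of show(bkapp) then displays the app inline; the exact incantation is
# in the linked notebook_embed example.
```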

The push_notebook method has always had its limitations, and in particular I have always made a point of mentioning that it is probably not suitable for continuous streaming. It's possible that it could be made more robust for this use case, but that will require people/time/resources, which means it will not happen in the immediate future given the other priorities currently ongoing (unless a new contributor is interested in working on this immediately).

Bryan

···

On Feb 13, 2017, at 08:34, nicolas.fr <[email protected]> wrote:


--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/c7b4978c-3d48-463a-83f5-2bcdc46da363%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Bryan,
That’s the answer I was afraid of.

I could switch our application to a Bokeh-server-based implementation, but I’m afraid of discovering that the problem is still there.

The ‘push_notebook’ approach is really convenient. It leads to very natural and straightforward designs. It also fits perfectly with the notebook model.

Anyway, I will have a look at the proposed implementation.

Thanks.

···

On Monday, February 13, 2017 at 16:16:59 UTC+1, Bryan Van de ven wrote:




Hi,

FWIW I don't believe the problem will be there with the server. I have an idea of the possible issues with push_notebook, and they have to do with things that are specific to it. The server machinery is going to be more robust. Improving push_notebook almost certainly means re-implementing it to reuse as much of the server machinery as possible "under the hood".

By way of example, I can report that I have on several occasions now run roughly a dozen Bokeh apps simultaneously - including several demanding, continuously streaming apps such as the spectrogram, surface3d, and OHLC examples - for several days at a time without interruption at conference demo tables. There was no degradation in performance over those time periods.

Thanks,

Bryan

···

On Feb 13, 2017, at 09:39, nicolas.fr <[email protected]> wrote:


Hi,
I guess I can contribute to this issue - although with HTML output rather than notebook output - as I encountered a very similar problem today.
Consider this code:

import bokeh.io
import bokeh.plotting

bokeh.io.output_file('output.html')
f = bokeh.plotting.figure()
bokeh.io.show(f)

It will generate an empty plot and save it as HTML.
However, if you repeatedly run it in a notebook, the file size will grow. This seems to be related to renderers not being reset; fortunately, rendering an empty figure yields one warning message per run:


INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-99de-2decd68a5751', ...)

If you run it twice, it will show two warnings, and the output HTML will have doubled in size:

INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-99de-2decd68a5751', ...)
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0f73e421-22a9-4ad9-9eb6-e69f70d5e95a', ...)

The growth of the file can be reset with this call:
bokeh.io.reset_output()
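You can see the accumulation, and the reset, on the implicit current document directly (a small sketch; I am assuming curdoc() and reset_output() behave as documented):

```python
from bokeh.io import curdoc, reset_output
from bokeh.plotting import figure

# Each run of the cell adds another root to the implicit document
curdoc().add_root(figure())
curdoc().add_root(figure())
print(len(curdoc().roots))  # 2: both figures have accumulated

reset_output()              # discard all implicit output state
print(len(curdoc().roots))  # 0: a fresh, empty document
```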

I do not know if this helps with output_notebook and push_notebook, but I could imagine the root of the problem is similar…
Regards,
Georg

···

On Monday, February 13, 2017 at 16:52:34 UTC+1, Bryan Van de ven wrote:


Georg,
bokeh.io.reset_output() seems to have a positive effect on the problem we are addressing.

However, in my case, it has a bad side effect: it stops all the streaming activity I have in other cells of the same notebook.

I’m testing the “embed” approach suggested by Bryan.

Thanks for your contribution.

···

On Monday, February 13, 2017 at 17:16:25 UTC+1, Georg Pölzlbauer wrote:




It's possible (untested) that you could remove old plots from the document roots manually. The push_notebook function works by computing a "JSON diff" that is sent over notebook comms to update the BokehJS side (which is what actually causes plots to update, etc.). Well, if the implicit "current document" keeps continuously accumulating old plots, then you can imagine the problem: the JSON diff gets progressively more expensive to compute (even if those plots are no longer actually displayed, they are still in the document). So, you could try calling curdoc().remove_root(...) on plots you are done with and are no longer displaying. That might work, but I can't make any promises offhand.
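Something like this, untested (the explicit add_root stands in for whatever show() would normally do with the new plot):

```python
from bokeh.io import curdoc
from bokeh.plotting import figure

# Before creating/showing a new plot, drop the ones we no longer display
doc = curdoc()
for old_root in list(doc.roots):   # copy the list: we mutate while iterating
    doc.remove_root(old_root)

p = figure()                       # the fresh plot for this run
doc.add_root(p)
assert len(doc.roots) == 1         # only the current plot remains in the doc
```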

Alternatively, show(...) etc. rely on an implicit document in order to afford convenience. You could manage Document instances yourself, explicitly, instead. An example is here:

  https://github.com/bokeh/bokeh/blob/master/examples/models/dateaxis.py

If you create a new explicit Document for each plot, then there would be no "accumulation". But you will have to use the Bokeh and notebook cell publishing APIs directly to show your plots, instead of "show". Additionally, you will have to poke around a few internal APIs to reproduce creating a _CommsHandle by hand:

  https://github.com/bokeh/bokeh/blob/master/bokeh/io.py#L338

and then make sure to call push_notebook with the corresponding explicitly created Document.
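The Document side of this is simple enough (a sketch; the comms publishing part is the internal piece linked above):

```python
from bokeh.document import Document
from bokeh.plotting import figure

# One fresh, explicit Document per plot: nothing accumulates across runs
doc = Document()
p = figure()
p.line([0, 1, 2], [0, 1, 4])
doc.add_root(p)

# The notebook publishing + _CommsHandle wiring would go here (see the
# io.py link above); push_notebook must then be given this same document.
```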

Bryan

···

On Feb 13, 2017, at 12:56, nicolas.fr <[email protected]> wrote:

Georg,
bokeh.io.reset_output() seems to have a positive effect on the problem we are addressing.
However, in my case, it has a bad side effect. It stops all the streaming activity I have in different cells of the same notebook.
In testing the "embed" approach suggested by Bryan.
Thanks for your contribution.

Le lundi 13 février 2017 17:16:25 UTC+1, Georg Pölzlbauer a écrit :
Hi,
I guess I can contribute to this issue, although with output as html rather than notebook, as I encountered a very similar problem today.
Consider this code:

import bokeh
bokeh.io.output_file('output.html')
f = bokeh.plotting.figure()
bokeh.io.show(f)

It will generate an empty plot and save it as an HTML.
However, if you repeatedly run it in a notebook, the file size will grow. This seems to be related to renderers not being reset; fortunately, rendering an empty figure yields one warning message:

INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/
env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-99de-2decd68a5751', ...)

if you run it twice, it will show two warnings, and the output html will have doubled in size:

INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/
env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-
99de-2decd68a5751', ...)
WARNING:/home/erwin/BigData/
env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0f73e421-22a9-4ad9-9eb6-e69f70d5e95a', ...)

The growth of the file can be reset with this call:
bokeh.io.reset_output()

I do not know if this helps with output_notebook and push_notebook, but I could imagine the root of the problem is similar...
Regards,
Georg

Am Montag, 13. Februar 2017 16:52:34 UTC+1 schrieb Bryan Van de ven:
HI,

FWIW I don't believe the problem will be there with the server. I have an idea of the possible issues with push_notebook, and they have to do with things that are specific to it. The sever machinery is going to be more robust. Improving push_notebook almost certainly means re-implementing it to re-use as much of the server machinery as possible "under the hood".

By way of example, I can report that I have on several occasions now run roughly a dozen bokeh apps simultaneously, including several demanding and continuously streaming apps such as the spectrogram, surface3d and OHLC, for several days at a time without interruption at conference demo tables. There was no degradation in performance over those time periods.

Thanks,

Bryan

> On Feb 13, 2017, at 09:39, nicolas.fr <[email protected]> wrote:
>
> Bryan,
> That's the answer I was afraid of.
> I could switch our application to a Bokeh server based implementation but I'm afraid of discovering that the problem is still there.
> The 'push_notebook' approach is really convenient. It provides very natural and straightforward designs. It also fits perfectly with the notebook model.
> Anyway, I will have a look to the proposed implementation.
> Thanks.
>
>
> Le lundi 13 février 2017 16:16:59 UTC+1, Bryan Van de ven a écrit :
> At the moment the best suggestion I can offer is to re-consider using the Bokeh server, especially in light of the recent demonstration of embedding the server directly on a notebook IOLoop, instead of a separate process:
>
> https://github.com/bokeh/bokeh/blob/master/examples/howto/server_embed/notebook_embed.ipynb
>
> This approach is currently a bit clunky, but is usable, and the issues there described will be resolved in the next release.
>
> The push_notebook method has always had its limitations, and in particular I have always made a point to mention that it is probably not suitable for continuous streaming. It's possible that it could be made more robust for this use-case but that will require people/time/resources which means that it will not happen in the immediate future due to work for other priorities currently ongoing (or a new contributor interested in working on this immediately)
>
> Bryan


I really would like to avoid reaching this level of 'complexity', but your proposal contains very precious info.
Thanks Bryan.


On Tuesday, February 14, 2017 at 00:15:46 UTC+1, Bryan Van de ven wrote:

It's possible (untested) that you could remove old plots from the document roots manually. The push_notebook function works by computing a "JSON diff" to send over notebook comms, which updates the BokehJS side (and that is what actually causes plots to update, etc.). Well, if the implicit "current document" keeps continuously accumulating old plots, then you can imagine the problem: the JSON diff gets progressively more expensive to compute (even if those plots are no longer actually displayed, they are still in the document). So, you could try calling curdoc().remove_root(…) on plots you are done with and are no longer displaying. That might work, I can't make any promises offhand.
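Bryan's suggestion might be sketched like this (an untested, minimal sketch, assuming bokeh is installed; the two figures below are stand-ins for plots created and shown in earlier notebook cells):

```python
from bokeh.io import curdoc
from bokeh.plotting import figure

doc = curdoc()  # the implicit "current document"

# stand-ins for plots created and shown in earlier cells
old_plot = figure(title="run 1")
new_plot = figure(title="run 2")
doc.add_root(old_plot)
doc.add_root(new_plot)

# drop the plot we no longer display, so the JSON diff computed by
# push_notebook does not keep growing with every run
doc.remove_root(old_plot)

print(len(doc.roots))  # only new_plot remains
```

The key point is that a plot stays in the document (and therefore in every diff) until it is explicitly removed, even after it disappears from the notebook output.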

Alternatively, show(…) etc. rely on an implicit document in order to afford convenience. You could instead manage Document objects yourself, explicitly. An example is here:

    [https://github.com/bokeh/bokeh/blob/master/examples/models/dateaxis.py](https://github.com/bokeh/bokeh/blob/master/examples/models/dateaxis.py)

If you create a new explicit Document for each plot, then there would be no “accumulation”. But you will have to use the Bokeh and notebook cell publishing APIs directly to show your plots, instead of “show”. Additionally, you will have to poke around a few internal APIs to reproduce creating a _CommsHandle by hand:

    [https://github.com/bokeh/bokeh/blob/master/bokeh/io.py#L338](https://github.com/bokeh/bokeh/blob/master/bokeh/io.py#L338)

and then also make sure to call push_notebook with the corresponding explicitly created Document.

Bryan

On Feb 13, 2017, at 12:56, nicolas.fr [email protected] wrote:

Georg,

bokeh.io.reset_output() seems to have a positive effect on the problem we are addressing.

However, in my case, it has a bad side effect: it stops all the streaming activity I have in different cells of the same notebook.

I'm testing the "embed" approach suggested by Bryan.

Thanks for your contribution.

On Monday, February 13, 2017 at 17:16:25 UTC+1, Georg Pölzlbauer wrote:

Hi,
I guess I can contribute to this issue, although with output as html rather than notebook, as I encountered a very similar problem today.
Consider this code:

import bokeh.io
import bokeh.plotting

bokeh.io.output_file('output.html')
f = bokeh.plotting.figure()
bokeh.io.show(f)

It will generate an empty plot and save it as an HTML.
However, if you repeatedly run it in a notebook, the file size will grow. This seems to be related to renderers not being reset; fortunately, rendering an empty figure yields one warning message:

INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-99de-2decd68a5751', ...)

If you run it twice, it will show two warnings, and the output HTML will have doubled in size:

INFO:bokeh.core.state:Session output file 'output.html' already exists, will be overwritten.
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0155b85c-20b9-432a-99de-2decd68a5751', ...)
WARNING:/home/erwin/BigData/env_PA_py3/lib/python3.5/site-packages/bokeh/core/validation/check.py:W-1001 (NO_DATA_RENDERERS): Plot has no data renderers: Figure(id='0f73e421-22a9-4ad9-9eb6-e69f70d5e95a', ...)

The growth of the file can be reset with this call:

bokeh.io.reset_output()

I do not know if this helps with output_notebook and push_notebook, but I could imagine the root of the problem is similar…

Regards,
Georg

On Monday, February 13, 2017 at 16:52:34 UTC+1, Bryan Van de ven wrote:

Hi,

FWIW I don't believe the problem will be there with the server. I have an idea of the possible issues with push_notebook, and they have to do with things that are specific to it. The server machinery is going to be more robust. Improving push_notebook almost certainly means re-implementing it to re-use as much of the server machinery as possible "under the hood".

By way of example, I can report that I have on several occasions now run roughly a dozen bokeh apps simultaneously, including several demanding and continuously streaming apps such as the spectrogram, surface3d and OHLC, for several days at a time without interruption at conference demo tables. There was no degradation in performance over those time periods.

Thanks,

Bryan

On Feb 13, 2017, at 09:39, nicolas.fr [email protected] wrote:

Bryan,
That’s the answer I was afraid of.
I could switch our application to a Bokeh server based implementation but I’m afraid of discovering that the problem is still there.
The ‘push_notebook’ approach is really convenient. It provides very natural and straightforward designs. It also fits perfectly with the notebook model.
Anyway, I will have a look at the proposed implementation.
Thanks.


Bryan,
wow, the remove_root() approach works! I tested it with Nicolas' notebook code; it brought the update time down to milliseconds and made updates constant for both plot and data source refreshes.

Just out of curiosity:

1. You mentioned this was "untested"; is removing from the document not something canonical to do?
2. I do not wholly understand the "document" concept and what documents actually contain apart from the one root document; could you suggest any particular sources or pages where I can look this up?

Thanks,
Georg

I'm glad that works!

By "untested" I mean: there are unit tests for remove_root, but I've never had a situation where I needed to try using it, personally. It exists for a reason, but it's not a commonly used function.

A Document is a collection of Bokeh models, and in particular it is the smallest meaningful unit of serialization. Because models may refer to other models (e.g. Plots *have* Axes which *have* Tickers...), it's only guaranteed to be meaningful to serialize a Document in its entirety (Bokeh will make sure to automatically collect any models reachable from any of the "roots").
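The idea can be sketched as follows (an illustrative sketch, assuming bokeh is installed):

```python
from bokeh.document import Document
from bokeh.plotting import figure

doc = Document()
p = figure(title="demo")
p.line([1, 2, 3], [1, 2, 3])
doc.add_root(p)

# serializing the document automatically collects every model
# reachable from its roots: the figure, its axes, tickers,
# renderers, data sources, and so on
serialized = doc.to_json()
print(len(doc.roots))  # prints 1: one root, but many models behind it
```

This is why an accumulating document makes push_notebook progressively slower: every root, and everything reachable from it, participates in each diff.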

There's a new notional diagram and partially updated docstrings in the ref guide:

  http://bokeh.pydata.org/en/latest/docs/reference/document.html

There's also some recently updated information and new diagrams here as well:

  http://bokeh.pydata.org/en/latest/docs/user_guide/server.html

Thanks,

Bryan


--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/628cfeaf-e6e1-4997-b9d1-bf0f9267f621%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Hi all,

Attached is a simulator of our acquisition system. It only depends on standard modules so anyone should be able to test it.

As suggested by Bryan, the data streaming implementation is based on an embedded Bokeh server. The good news is that the performance problem - which was at the origin of this post - seems to be solved. So it's worth the effort to switch to the embedded server. Thanks for that Bryan.

However, I have (at least) one remaining problem with the new implementation. It's related to a "too many open files" error I got after several restarts of the server. From some web searching, it seems to be due to the connection between the browser and Tornado not being properly closed. The error is tedious but simple to reproduce with the attached notebook:

Step-0: execute the 2nd cell of the notebook - you should obtain the following interface:

Step-1: click on the start (play icon) button to load the server and start the data stream

Step-2: (optionally) wait for some data to be delivered

Step-3: click on the close (cross icon) button to close everything and ‘destroy’ the server

Repeat the Step-1 → Step-3 sequence until the 'too many open files' error is raised (approx. 15 times on my system):
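For reference, a teardown pattern that releases the listening sockets between runs might look like this (an untested sketch, assuming bokeh and tornado are installed; `make_app` is a hypothetical stand-in for the notebook's real application function):

```python
from bokeh.plotting import figure
from bokeh.server.server import Server

def make_app(doc):
    # hypothetical stand-in for the real streaming application
    p = figure()
    p.line([1, 2, 3], [1, 2, 3])
    doc.add_root(p)

server = Server({'/app': make_app}, port=0)  # port=0: pick a free port
server.start()
# ... stream data, push updates ...
server.stop()  # stop accepting connections and release the sockets
```

If the server's sockets are not released on each restart, file descriptors accumulate until the OS limit is hit, which would match the OSError below.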

INFO:bokeh.server.server:Starting Bokeh server version 0.12.4
WARNING:bokeh.server.server:Host wildcard '*' can expose the application to HTTP host header attacks. Host wildcard should only be used for testing purpose.
WARNING:bokeh.server.server:Host wildcard '*' can expose the application to HTTP host header attacks. Host wildcard should only be used for testing purpose.
ERROR:root:Internal Python error in the inspect module. Below is the traceback from this internal error.
INFO:root: Unfortunately, your original traceback can not be constructed.

Bokeh output already redirected to Jupyter notebook
starting Bokeh server…
Traceback (most recent call last):
  File "/Volumes/MacHD/anaconda/lib/python3.5/site-packages/ipywidgets/widgets/widget.py", line 62, in __call__
    local_value = callback(*args, **kwargs)
  File "", line 390, in __on_start_clicked
    self._data_streamer.start()
  File "", line 188, in start
    self.__start_bokeh_server()
  File "", line 236, in __start_bokeh_server
    allow_websocket_origin=['*']
  File "/Volumes/MacHD/anaconda/lib/python3.5/site-packages/bokeh/server/server.py", line 143, in __init__
  File "/Volumes/MacHD/anaconda/lib/python3.5/site-packages/bokeh/server/tornado.py", line 265, in initialize
  File "/Volumes/MacHD/anaconda/lib/python3.5/concurrent/futures/process.py", line 395, in __init__
  File "/Volumes/MacHD/anaconda/lib/python3.5/multiprocessing/context.py", line 111, in SimpleQueue
  File "/Volumes/MacHD/anaconda/lib/python3.5/multiprocessing/queues.py", line 329, in __init__
  File "/Volumes/MacHD/anaconda/lib/python3.5/multiprocessing/context.py", line 66, in Lock
  File "/Volumes/MacHD/anaconda/lib/python3.5/multiprocessing/synchronize.py", line 163, in __init__
  File "/Volumes/MacHD/anaconda/lib/python3.5/multiprocessing/synchronize.py", line 60, in __init__
OSError: [Errno 24] Too many open files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/Volumes/MacHD/anaconda/lib/python3.5/site-packages/IPython/core/interactiveshell.py”, line 1821, inshowtraceback
stb = value.render_traceback()
AttributeError: ‘OSError’ object has no attribute ‘render_traceback

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File “/Volumes/MacHD/anaconda/lib/python3.5/site-packages/IPython/core/ultratb.py”, line 1132, in get_records
File “/Volumes/MacHD/anaconda/lib/python3.5/site-packages/IPython/core/ultratb.py”, line 313, in wrapped
File “/Volumes/MacHD/anaconda/lib/python3.5/site-packages/IPython/core/ultratb.py”, line 358, in_fixed_getinnerframes
File “/Volumes/MacHD/anaconda/lib/python3.5/inspect.py”, line 1453, in getinnerframes
File “/Volumes/MacHD/anaconda/lib/python3.5/inspect.py”, line 1410, in getframeinfo
File “/Volumes/MacHD/anaconda/lib/python3.5/inspect.py”, line 672, in getsourcefile
File “/Volumes/MacHD/anaconda/lib/python3.5/inspect.py”, line 701, in getmodule
File “/Volumes/MacHD/anaconda/lib/python3.5/inspect.py”, line 685, in getabsfile
File “/Volumes/MacHD/anaconda/lib/python3.5/posixpath.py”, line 361, in abspath
OSError: [Errno 24] Too many open files
`
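To confirm that file descriptors really are leaking across start/stop cycles (rather than some other resource hitting a limit), one can watch the process's open descriptor count between runs. This is a diagnostic sketch, not part of the attached notebook; `open_fd_count` is a helper name I made up:

```python
import os

def open_fd_count():
    """Count the file descriptors currently open by this process.

    Works on Linux (/proc/self/fd) and macOS (/dev/fd): each entry
    in that directory is one open descriptor.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            return len(os.listdir(fd_dir))
    raise OSError("cannot enumerate file descriptors on this platform")

# A leak shows up as a count that grows after each start/stop cycle.
before = open_fd_count()
leaked = open(os.devnull)   # simulate a socket left open by one cycle
after = open_fd_count()
leaked.close()
```

Printing `open_fd_count()` in Step-1 and Step-3 should show whether each cycle returns the count to its baseline; with the default macOS limit of 256 descriptors, a dozen leaked sockets per cycle would explain the failure after roughly 15 runs.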


BTW, is it possible to disable the Tornado log? It pollutes the cell with messages that are meaningless for our end users.

Thanks.
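In case it helps anyone hitting the same error: since the suspected cause is a server that is never properly shut down, one mitigation worth trying is to guarantee that teardown runs even when the streaming code raises. A minimal sketch follows; the `start()`/`stop()` lifecycle is an assumption about the wrapped object (adapt it to the real `DataStreamer`/`Server` API), and `FakeServer` is a stand-in used only to demonstrate the pattern without a Bokeh dependency:

```python
from contextlib import contextmanager

@contextmanager
def running_server(server):
    """Start a server and guarantee it is stopped afterwards.

    `server` is any object exposing start()/stop(). Stopping in a
    finally block releases the listening sockets even if the body
    raises, which is one common source of 'too many open files'
    accumulating across repeated runs.
    """
    server.start()
    try:
        yield server
    finally:
        server.stop()

# Demonstration with a stand-in object (no Bokeh dependency):
class FakeServer:
    def __init__(self):
        self.running = False
    def start(self):
        self.running = True
    def stop(self):
        self.running = False

srv = FakeServer()
try:
    with running_server(srv):
        raise RuntimeError("simulated streaming failure")
except RuntimeError:
    pass
# srv.running is now False: stop() ran despite the error
```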

EmbedBokehServer.ipynb (19.6 KB)

Hi,

Great simulator!

I ran it many times, even playing with the 3 cells at the same time, and I couldn't reproduce any drop in performance (values fluctuating between 20 and 40 ms for INFO:tornado.access:200 GET /autoload…) nor any 'too many open files' stop.

Once I had an error with the widgets of the first cell duplicated on two lines, but couldn’t reproduce it.

Here is my conf:

Laptop 8GB RAM

Platform: Linux-4.8.0-37-generic-x86_64-with-debian-stretch-sid
Python: 3.5.2 |Anaconda custom (64-bit)| (default, Jul 2 2016, 17:53:06)
Bokeh: 0.12.4

Hope it helps.

···

On Monday, February 6, 2017 at 7:59:28 PM UTC+1, nicolas.fr wrote:

Hi all,
I just opened an issue on github. It's related to a performance drop I observe when creating/updating the same plot several times from the same cell. […]

Thanks for testing the notebook.

ipywidgets has a known bug that generates a warning if you execute anything related to the widgets before the kernel startup completes.

Might be what you observed.

···

Le mardi 14 février 2017 23:04:53 UTC+1, alEx S a écrit :

Once I had an error with the widgets of the first cell duplicated on two lines, but couldn’t reproduce it.

@Georg,

Would you mind posting your mods of the DropInPerformance notebook?

Thanks.

Nicolas

Regarding logging, Bokeh uses the standard Python logging module and nothing fancy. Options include setting the log level to "critical", or redirecting the log elsewhere using standard Python logging means.
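For example, with plain stdlib logging calls (a sketch; the logger names below are the ones Bokeh and Tornado conventionally register):

```python
import logging

# Drop everything below CRITICAL from Bokeh's and Tornado's loggers
# so the notebook cell stays clean.
for name in ("bokeh", "tornado", "tornado.access", "tornado.application"):
    logging.getLogger(name).setLevel(logging.CRITICAL)

# Or keep the records but send them to a file instead of the cell:
file_handler = logging.FileHandler("bokeh_server.log")
logging.getLogger("bokeh").addHandler(file_handler)
```

Run this once before starting the embedded server; logger levels and handlers are process-wide, so any later server restarts inherit them.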

Thanks,

Bryan

···

On Feb 14, 2017, at 15:11, nicolas.fr <[email protected]> wrote:

[…]

BTW, is it possible to disable the tornado log? It 'pollutes' the cell with messages which are a bit meaningless for our end users.

Thanks.
