Live Plotting Speed Issues: How can I choose when to update my live plots?

Hey,

I am trying to have a gridplot with potentially a large number of subplots. Each of these subplots again contains several lines. Altogether I assume this may be considered as “a lot of data” that needs to be transferred.

Each line is updated periodically and the changes are displayed in the respective subplot. From what I understand, each line is updated once its data is changed. This repeated synchronisation I assume to be the source of speed issues. My question therefore is: Is there a way to prevent the updates being synced with the plot until I explicitly give a command?

The following code is an example demonstrating the issue. On my laptop, each cycle takes between 0.25 and 0.4 seconds.

import numpy as np
import time

from bokeh.io import gridplot
from bokeh.client import push_session
from bokeh.plotting import figure, curdoc

"""
Plots a bunch of dancing gaussians.

Currently, it's a bit slow and some of the plots are only displaying after refreshing the browser
"""

def demo_online_gaussians_v1(n_points = 100, n_gaussians = 10, n_plots = 12, (x_min, x_max) = (-5, 5)):

    #### Destination Setup ####
    session = push_session(curdoc(),session_id="Test1")

    #### Plot Setup ####
    x = np.linspace(x_min, x_max, n_points)
    offsets = np.arange(n_gaussians)*2*np.pi/(n_gaussians+1)

    axs = [figure(width = 300, height=300) for _ in xrange(n_plots)]
    fig = gridplot([axs[i:i+4] for i in range(0, n_plots, 4)]) # rows of four, covering all n_plots

    lines_plots = {}

    #### Show the plots with initial content ####
    session.show()

    #### update plots in loop ####
    for t in np.linspace(0, 50, 300):
        means = np.sin(t+offsets)
        stds = 1
        data = np.exp(-(x[:, None] - means)**2 / (2*stds**2)) / np.sqrt(2*np.pi*stds**2)

        t_start = time.time()

        for ax in axs:
            # Get new data
            # Plot
            if t==0:
                lines_plots[ax] = [ax.line(x, data[:, i], line_width=4) for i in xrange(n_gaussians)]
            else:
                for i, r in enumerate(lines_plots[ax]):
                    r.data_source.data["y"] = data[:, i]

        print 'Time to update plot: %s' % (time.time() - t_start)
        time.sleep(1.0)

if __name__ == '__main__':
    demo_online_gaussians_v1(n_plots = 12,n_gaussians = 10)

One question: do you need to update all the data in the plots on every update, or is it actually a streaming use case? (That is, do you really just need to add new data points to what is there?) If it is a streaming case, the first thing to try is the streaming API, which you can see in this example:

  https://github.com/bokeh/bokeh/tree/master/examples/app/ohlc

If you do need to update all the data, all the time, it might be time to look into some sort of "batching" API.

Bryan
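For readers who have not used the streaming API mentioned above, here is a minimal Python 3 sketch (not taken from the linked example). It assumes a Bokeh version that provides `ColumnDataSource.stream`; the helper names `new_samples` and `make_streaming_source` and the sine signal are made up for illustration. Streaming only sends the appended rows, so it helps only when the points already plotted stay unchanged.

```python
import math


def new_samples(t):
    """Return only the points to append for time t (hypothetical signal)."""
    return {"x": [t], "y": [math.sin(t)]}


def make_streaming_source(rollover=200):
    """Create a data source plus a function that appends new rows to it.

    Bokeh is imported lazily so that new_samples() works without it installed.
    """
    from bokeh.models import ColumnDataSource

    source = ColumnDataSource(data={"x": [], "y": []})

    def append(t):
        # stream() sends just the appended rows; rollover caps the number
        # of rows kept in the source, discarding the oldest ones.
        source.stream(new_samples(t), rollover=rollover)

    return source, append
```

A glyph created with `fig.line("x", "y", source=source)` would then grow each time `append` is called.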

On Apr 6, 2016, at 9:01 AM, Mat <[email protected]> wrote:
> [...]

--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/2142de6e-2123-48fd-9fea-7fc3de0f8b2c%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Thanks for the quick answer.
It is not a streaming use case. I want to plot Gaussians with changing means and variances, so I need to replace all the y-values on every update. What do you mean by a “batching” API? Do you mean that something like this does not exist yet? Is there maybe a workaround? :)

Thx

Mat

On Wednesday, 6 April 2016 16:10:48 UTC+2, Bryan Van de ven wrote:
> [...]

Correct, there is no API or mechanism to collect a larger batch of changes to send; each model property change generates an update. So the only workaround I can think of at the moment applies if all your data series happen to be the same size. Then you could put them in a single data source and update the .data dict on that source in one go (by making one new dict with all the data, not by updating e.g. .data["y1"] separately).
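A minimal Python 3 sketch of that workaround, assuming every curve in a subplot shares the same x grid so that all columns have equal length. The column names `y0`, `y1`, ... and the helpers `gaussian`, `build_columns` and `replace_all` are illustrative and not part of Bokeh.

```python
import math


def gaussian(x, mean, std):
    """Normal density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / math.sqrt(2 * math.pi * std ** 2)


def build_columns(xs, means, std=1.0):
    """Build one dict holding x and every curve, all of equal length."""
    columns = {"x": list(xs)}
    for i, mean in enumerate(means):
        columns["y%d" % i] = [gaussian(x, mean, std) for x in xs]
    return columns


def replace_all(source, xs, means, std=1.0):
    """Swap the whole .data dict in a single assignment.

    `source` is expected to be a bokeh.models.ColumnDataSource whose glyphs
    were created with e.g. fig.line("x", "y0", source=source).
    """
    source.data = build_columns(xs, means, std)
```

With ten curves per subplot, this replaces ten separate property changes with one per subplot.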

I also notice you are using bokeh.client in a separate Python process from the server. This adds roughly double the network and serialization overhead (by definition, there are now two hops to make, and there is nothing to be done about that). You may find a Bokeh app running in the server (i.e., a "bokeh serve myapp.py" style app) to be more performant.

Bryan
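As a rough Python 3 sketch of the "bokeh serve myapp.py" style, assuming a reasonably recent Bokeh release: the update runs inside the server process through a periodic callback, so the extra hop through bokeh.client goes away. The function names `curve` and `build_app`, the 1000 ms period and the single-curve layout are illustrative choices, not something from this thread.

```python
import math


def curve(t, n_points=100, x_min=-5.0, x_max=5.0):
    """Return x and y lists for a unit-variance Gaussian whose mean is sin(t)."""
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    mean = math.sin(t)
    ys = [math.exp(-((x - mean) ** 2) / 2) / math.sqrt(2 * math.pi) for x in xs]
    return {"x": xs, "y": ys}


def build_app(doc):
    """Attach one live plot to a Bokeh document.

    Bokeh is imported here so that curve() can be used without it installed.
    """
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(data=curve(0.0))
    fig = figure(width=300, height=300)
    fig.line("x", "y", source=source, line_width=4)
    doc.add_root(fig)

    state = {"t": 0.0}

    def update():
        state["t"] += 0.1
        source.data = curve(state["t"])  # one assignment, one update

    doc.add_periodic_callback(update, 1000)  # period in milliseconds
```

In `myapp.py`, the module would end with `from bokeh.io import curdoc` followed by `build_app(curdoc())`, and be started with `bokeh serve myapp.py`.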

On Apr 6, 2016, at 9:13 AM, Mat <[email protected]> wrote:
> [...]

Ah, OK, good to know. I will look into the .data dict; this might work for me.

The problem with running “bokeh serve myapp.py” is that the plotting functionality is only a small part of my whole setup. Plus, I have a distributed computing setup. I will think about implementing a “myapp.py” that collects and plots data that my main process has dumped to disk. Or is this exactly what is already happening implicitly in my current setup?

Thanks for helping me out on this :)

Mat

On Wednesday, 6 April 2016 17:55:58 UTC+2, Bryan Van de ven wrote:
> [...]

On Apr 6, 2016, at 3:42 PM, Mat <[email protected]> wrote:
> Ah, ok good to know. I will look for the .data dict, this might work for me.
>
> The problem with running "bokeh serve myapp.py" is that the plotting functionality is only a small part of my whole setup. Plus, I have a distributed computing setup. I will think of implementing a "myapp.py" that collects and plots data that my main process has dumped to disk. Or is this exactly what is happening at the moment (in my current setup) anyways implicitly?

No, the communication to the server is always over a websocket from any client, whether the client is a browser or a different Python process. With bokeh.client everything has to be serialized/unserialized from Python to the server, and then serialized/unserialized again from server to browser. A future binary array protocol extension might help in this scenario, but it has not been implemented (though at least the groundwork has been laid).

One example I would love to see developed is the "bokeh app updates from redis" type example (substitute your favorite data store for redis). So if dumping your data into something like that and having a Bokeh app as the front end sounds interesting, perhaps we could collaborate. There may be rough edges to find and smooth over for that use case though, so I understand if you are on a schedule that does not allow for much experimentation.

> Thanks for helping me out on this :)

My pleasure,

Bryan
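One way to picture the "app updates from a data store" idea, with a JSON file on disk standing in for redis, is sketched below in Python 3. This is an illustration under stated assumptions, not an existing Bokeh example: the file name `snapshot.json` and the helpers `dump_snapshot`, `load_snapshot` and `build_app` are hypothetical, and the snapshot is assumed to hold equal-length `x` and `y` lists. The compute process writes a complete snapshot atomically; the Bokeh app polls it and replaces .data in one go.

```python
import json
import os
import tempfile


def dump_snapshot(path, columns):
    """Called by the compute process: write all columns, then swap atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(columns, handle)
    os.replace(tmp_path, path)  # readers never see a half-written file


def load_snapshot(path):
    """Called by the plotting app: return the latest columns, or None."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None


def build_app(doc, path="snapshot.json", period_ms=1000):
    """Poll the snapshot file from inside a Bokeh server app."""
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(data={"x": [], "y": []})
    fig = figure(width=300, height=300)
    fig.line("x", "y", source=source)
    doc.add_root(fig)

    def poll():
        columns = load_snapshot(path)
        if columns is not None:
            source.data = columns  # whole snapshot in a single update

    doc.add_periodic_callback(poll, period_ms)
```

The main process would call `dump_snapshot` whenever new results are ready, while `myapp.py` calls `build_app(curdoc())` and runs under `bokeh serve`.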


Hey Bryan,

at the moment, I don’t have the capacity (who ever has…), but I will get back to you if we decide to use bokeh more extensively in our own projects. This might be happening.

Matthias

On Wednesday, 6 April 2016 22:49:21 UTC+2, Bryan Van de ven wrote:
> [...]