Real Time update slower with every iteration

Hi guys, I am fairly new here. I have created an app that gets data from a SQL server and streams it to the plots through a ColumnDataSource (CDS). I use the stream option of the CDS to keep the amount of data limited.
I also use periodic callbacks to update this data and the plots.
However, the plotting function takes slightly longer at every iteration. The update function looks something like this.

def update():
    get_data_from_server_and_filter_it()
    plot_data()

def plot_data():
    figure.line(x = 'x', y = 'y', source = CDS)

The get_data_from_server part takes the same time at each iteration. (It basically filters the data from SQL and streams it to the CDS.)
plot_data() takes more time at every iteration (about 0.01 secs more each time), which I don't understand, since streaming keeps the size of the data in the CDS the same. After about 200 iterations, each iteration takes more than 3 secs.

I am using the bokeh serve --show command in Anaconda to run the server. Once the plots take too long, I reset the server manually using ctrl+c. It then runs much quicker (0.8 secs per iteration), but the time slowly increases again.

Any help is appreciated! Cheers! :slight_smile:

Hi,

It's possibly a bug, but in any case it would require investigation to figure out, which means running actual code. Can you provide a complete minimal script to reproduce the issue?

Thanks,

Bryan

--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/90867092-bcf4-4e94-8297-6ae9e75b4b7a%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Hello, I fixed it. The problem was ghost renderers that for some reason were accumulating in the figure. Basically, the statement fig.line(x=x,y=y,source = cds) added another renderer to the figure on every callback. I replaced the statement with glph = fig.line(x = list(), y = list()) and removed the CDS altogether. I then kept appending glph to a list, and every minute I cleared the old glyphs with fig.renderers.remove(glyph for glyph in glyphs). Of course, I did not remove all of them; I kept the last couple of glyphs. This stopped the plotting time from increasing with every iteration.
I may be completely wrong in my understanding, but it worked.
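
For anyone who wants to see the accumulation for themselves, here is a minimal sketch (not the original app) that runs without a server. It simulates five callbacks that each call fig.line, counts how many renderers the figure gained, and then prunes all but the last two. The names created, KEEP and baseline are made up for the illustration. Note that list.remove takes a single item, so the generator form quoted above is presumably shorthand; the sketch rebuilds fig.renderers instead.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

fig = figure()
cds = ColumnDataSource(data=dict(x=[0, 1, 2], y=[1, 3, 2]))
baseline = len(fig.renderers)  # whatever the figure starts with

created = []  # renderers added by the simulated callbacks


def plot_data():
    # Each call adds a brand-new line renderer to the figure.
    created.append(fig.line(x="x", y="y", source=cds))


for _ in range(5):  # simulate five periodic callbacks
    plot_data()
added_before_pruning = len(fig.renderers) - baseline

# Prune: keep only the KEEP most recent renderers, drop the rest.
KEEP = 2
stale_ids = {id(r) for r in created[:-KEEP]}
fig.renderers = [r for r in fig.renderers if id(r) not in stale_ids]
del created[:-KEEP]
added_after_pruning = len(fig.renderers) - baseline

print(added_before_pruning, added_after_pruning)
```

Printing len(fig.renderers) inside the real callback is a quick way to check whether an app has this problem.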

Hi,

No, you are correct. Every time you call something like "fig.line", it adds an entire new glyph renderer, and possibly a new CDS as well. It's not a lightweight operation. The best practice for Bokeh is to "set up" a plot or app up front by defining all the data sources, glyphs, etc. once. Then later, only update the CDS data sources in order to change things. Updating a CDS that already exists is a very optimized and efficient operation.
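
A minimal sketch of that pattern, under some assumptions: fetch_new_rows is a hypothetical stand-in for the SQL query (it just produces synthetic points), and ROLLOVER, BATCH and the 1000 ms period are illustrative values, not taken from the original app. The line is created once at set-up, and the periodic callback only streams new data into the existing CDS.

```python
import itertools
import math

from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

ROLLOVER = 200  # maximum number of points kept in the CDS
BATCH = 50      # points added per callback (illustrative)

# Set-up, done exactly once: one source, one figure, one line renderer.
source = ColumnDataSource(data=dict(x=[], y=[]))
fig = figure()
fig.line(x="x", y="y", source=source)
renderers_after_setup = len(fig.renderers)

_counter = itertools.count()


def fetch_new_rows():
    """Hypothetical stand-in for the SQL query; returns synthetic points."""
    xs = [next(_counter) for _ in range(BATCH)]
    return dict(x=xs, y=[math.sin(x / 10) for x in xs])


def update():
    # Only the data changes; no new glyphs are created here.
    source.stream(fetch_new_rows(), rollover=ROLLOVER)


curdoc().add_root(fig)
curdoc().add_periodic_callback(update, 1000)  # period in milliseconds
```

Run with bokeh serve --show, the number of renderers stays fixed however long the app runs, and the rollover argument caps the data at ROLLOVER points.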

Thanks,

Bryan

Hello Bryan,
I had initially done exactly that. However, I needed to plot different signals on the same figure according to the user's selection, so I created a CDS for every option in the dropdown. Once a user selects an option, they must be able to see that line on the same figure alongside the previous plot. A reset button then clears the plot. I thus had to maintain many CDS at once, and so I couldn't figure out why my plotting function was taking incrementally more time with every plot.
Thanks for your help and advice, mate! Cheers!
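
One possible way to combine this use case with the set-up-once advice is sketched below. This is only an illustration, not the approach used in the original app: the signal names, the placeholder option and the sample data are all made up. Every signal gets its own CDS and line renderer at start-up, all hidden; the dropdown callback only flips visible to True, and the reset button hides everything again, so no glyphs are created inside callbacks.

```python
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource, Select
from bokeh.plotting import figure

SIGNALS = ["signal_a", "signal_b", "signal_c"]  # hypothetical signal names
PLACEHOLDER = "(choose a signal)"               # hypothetical empty option

fig = figure()
sources = {}
renderers = {}
for offset, name in enumerate(SIGNALS):
    # One CDS and one line per signal, created once and hidden initially.
    sources[name] = ColumnDataSource(
        data=dict(x=[0, 1, 2], y=[offset, offset + 1, offset])
    )
    renderers[name] = fig.line(x="x", y="y", source=sources[name])
    renderers[name].visible = False

select = Select(title="Signal", value=PLACEHOLDER, options=[PLACEHOLDER] + SIGNALS)
reset_button = Button(label="Reset")


def show_selected(attr, old, new):
    # Reveal the chosen line next to whatever is already visible.
    if new in renderers:
        renderers[new].visible = True


def reset(event=None):
    # Hide every line and return the dropdown to the placeholder.
    for renderer in renderers.values():
        renderer.visible = False
    select.value = PLACEHOLDER


select.on_change("value", show_selected)
reset_button.on_click(reset)
curdoc().add_root(column(select, reset_button, fig))
```

The data in each CDS can still be refreshed with stream from a periodic callback; only the visibility changes when the user interacts.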
