ColumnDataSource object changes data type from Numpy Array to List

Hi All,

I am working on a project that streams data to a ColumnDataSource and then uses that data to create a streaming plot. Because this is not a public project, I don't think I can share the whole code, but I will describe what is happening here.

I first create a ColumnDataSource object:

source = ColumnDataSource(dict(
    time=np.zeros(rollingWindowSize), roomTemp=np.zeros(rollingWindowSize)))

I then record data from a temperature sensor and update the ColumnDataSource as follows (note that I'm initializing the new data dictionary with an old value for the room temperature):

newData = dict(time=[t], roomTemp=[RT_store[-1]])
newData['roomTemp'] = [RT_store[-1]]
source.stream(newData,rollover=rollingWindowSize)
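For readers less familiar with rollover, here is a rough pure-Python sketch of the behaviour the call above relies on: new values are appended to every column, and once a column exceeds the rollover size the oldest values are dropped from the front. The helper name stream_with_rollover is hypothetical and this is not Bokeh's implementation; it uses plain lists and Python 3 so it runs without Bokeh or NumPy.

```python
def stream_with_rollover(data, new_data, rollover=None):
    """Append new_data to each column of data, keeping at most `rollover` values.

    Hypothetical stand-in for illustration only; not Bokeh's implementation.
    """
    if set(new_data) != set(data):
        raise ValueError("Must stream updates to all existing columns")
    for name, values in new_data.items():
        column = list(data[name]) + list(values)
        if rollover is not None and len(column) > rollover:
            # Drop the oldest values from the front of the column.
            del column[:len(column) - rollover]
        data[name] = column
    return data


window = 5
data = {"time": [0.0] * window, "roomTemp": [0.0] * window}
for t, temp in enumerate([22.4, 22.4, 22.7], start=1):
    stream_with_rollover(data, {"time": [t], "roomTemp": [temp]}, rollover=window)

print(data["roomTemp"])  # [0.0, 0.0, 22.4, 22.4, 22.7]
```

With this model, the leading zeros are pushed out gradually as readings arrive, which matches the shape of the "After" output shown further down.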

I then begin monitoring the plot. I noticed that within the first 5 minutes of running the code, the plot resets several times and then stabilizes for the remainder of the run. Digging through the code, I found that the data type of the source data is changing.

Before (ndarray):
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

After (list):
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 22.4, 22.4, 22.4, 22.4, 22.4, 22.4, 22.4, 22.4, 22.4, 22.4, 22.7, 22.7, 22.7]

In essence, type(source.data['roomTemp']) changes from NumPy array to list. When this happens, the list is initially empty, so the plot is reset. I'm not sure why the data type changes; however, I have tracked the change to the stream function in the Bokeh sources. There, the keys in the oldkeys variable become unicode strings at the same time the data type changes to list.

def stream(self, new_data, rollover=None):
    import numpy as np

    newkeys = set(new_data.keys())
    oldkeys = set(self.data.keys())

    print("newkeys " + str(newkeys))
    print("oldkeys " + str(oldkeys))

    if newkeys != oldkeys:
        missing = oldkeys - newkeys
        extra = newkeys - oldkeys
        if missing and extra:
            raise ValueError("Must stream updates to all existing columns (missing: %s, extra: %s)" % (", ".join(sorted(missing)), ", ".join(sorted(extra))))
        elif missing:
            raise ValueError("Must stream updates to all existing columns (missing: %s)" % ", ".join(sorted(missing)))
        else:
            raise ValueError("Must stream updates to all existing columns (extra: %s)" % ", ".join(sorted(extra)))

    lengths = set()
    for x in new_data.values():
        if isinstance(x, np.ndarray):
            if len(x.shape) != 1:
                raise ValueError("stream(...) only supports 1d sequences, got ndarray with size %r" % (x.shape,))
            lengths.add(x.shape[0])
        else:
            lengths.add(len(x))

    if len(lengths) > 1:
        raise ValueError("All streaming column updates must be the same length")

    print("before " + str(new_data.values()))
    self.data._stream(self.document, self, new_data, rollover)

newkeys set(['time', 'roomTemp'])
oldkeys set([u'time', u'roomTemp'])
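A note on the u'' prefixes: in Python 2, json.loads returns unicode strings, so unicode keys are consistent with the data dict having been replaced by one decoded from JSON. For ASCII column names, 'time' == u'time', so the newkeys != oldkeys check still passes and the key type itself should not be what resets the plot. The sketch below mimics that round trip with made-up sample data. It is written for Python 3, where the str/unicode distinction no longer shows up.

```python
import json

# Made-up sample data standing in for the real source and update.
old_data = {"time": [0.0, 0.0], "roomTemp": [0.0, 0.0]}
new_data = {"time": [1.0], "roomTemp": [22.4]}

# Simulate the data dict coming back through a JSON encode/decode cycle.
round_tripped = json.loads(json.dumps(old_data))

newkeys = set(new_data.keys())
oldkeys = set(round_tripped.keys())

print(newkeys == oldkeys)  # True: the column names survive the round trip
```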

Does anyone know why this is happening? If not, where can I find this function to continue troubleshooting the code:

self.data._stream(self.document, self, new_data, rollover)

Hi,

This is ultimately a result of the fact that all the data has to be serialized to be sent to and from the browser, that all serialization is currently just "JSON everything", and that JSON has no notion of numpy arrays (only lists).
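To illustrate the point, here is a minimal sketch using only the Python 3 standard library. The typed array from the array module is an assumed stand-in for a NumPy ndarray (so the snippet runs without NumPy), and json_round_trip is a hypothetical helper, not part of Bokeh.

```python
import json
from array import array


def json_round_trip(column):
    """Encode a column as JSON text and decode it again, as a JSON-only protocol would."""
    # JSON has no typed-array type, so the values go out as a plain JSON array.
    payload = json.dumps(list(column))
    return json.loads(payload)


before = array("d", [0.0, 0.0, 22.4, 22.7])  # typed array standing in for an ndarray
after = json_round_trip(before)

print(type(before).__name__, "->", type(after).__name__)  # array -> list
```

The values come back unchanged, but the container type is lost, and the result is a new object rather than the original one.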

There is an open PR for binary array serialization that may ameliorate this situation in some cases:

  https://github.com/bokeh/bokeh/pull/5544

But a few caveats should be noted:

* Only certain numpy array dtypes can be serialized and round tripped in this way: only array types that correspond to available JS typed arrays (e.g. float32, float64, etc.) can be.

* I doubt (at least to start) that object identity can be maintained. i.e., CDS columns that start as arrays can remain arrays, but any array round tripped from a client might be a *new, different* array object at some point.
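As a rough guide to the first caveat, the table below pairs NumPy dtype names with the classic JavaScript typed arrays that hold the same element type. This is only a sketch: the thread does not say which subset the PR actually supports, and the names JS_TYPED_ARRAYS and js_typed_array_for are hypothetical, introduced here for illustration.

```python
# NumPy dtype names paired with the JavaScript typed array of the same element type.
# Which of these Bokeh's binary serialization accepts is an assumption, not confirmed here.
JS_TYPED_ARRAYS = {
    "int8": "Int8Array",
    "uint8": "Uint8Array",
    "int16": "Int16Array",
    "uint16": "Uint16Array",
    "int32": "Int32Array",
    "uint32": "Uint32Array",
    "float32": "Float32Array",
    "float64": "Float64Array",
}


def js_typed_array_for(dtype_name):
    """Return the matching JS typed array name, or None if there is no entry in the table."""
    return JS_TYPED_ARRAYS.get(dtype_name)


for name in ("float64", "complex128", "object"):
    print(name, "->", js_typed_array_for(name))
```

Columns whose dtype has no entry here (for example object or complex arrays) would presumably still fall back to plain JSON lists.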

It's also possible there are other issues to look into and improve or fix. For instance, there may be issues specific to .stream that are also coincidentally causing such conversions. It's also possible that object identity can be maintained with more care, but that will require experimentation and benchmarking.

Thanks,

Bryan

On Dec 15, 2016, at 12:41 PM, RedRaven <[email protected]> wrote:

> [quoted text trimmed]

--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/6d8b58a6-44cd-4e30-be03-420ebd81ad02%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Thanks for the thorough reply, Bryan. Does this imply that if I start the ColumnDataSource with a list, it should stay consistent?

There isn't a particular reason why this should stay in array format.

Hi,

Yes, I believe that would be the case currently. (But please alert me if I'm mistaken.)

Thanks,

Bryan

On Dec 15, 2016, at 1:07 PM, RedRaven <[email protected]> wrote:

> [quoted text trimmed]

Thanks Bryan. I tried the change, i.e., I initialized the CDS with lists:

source = ColumnDataSource(dict(
    time=[0]*rollingWindowSize, roomTemp=[0]*rollingWindowSize))

I also updated the values being streamed into the CDS to be float32. Now the data in the CDS is consistently a list; however, this list is still being cleared a few minutes into the run. Interestingly, it eventually stabilizes and does not clear again.

On Thursday, 15 December 2016 14:12:32 UTC-5, Bryan Van de ven wrote:

> [quoted text trimmed]

Ok,

There might be a separate bug here to look into. If you can, I'd suggest making a new GH issue with a code sample that can be run. My focus for 0.12.5 will be some of these protocol-related issues, so this would be good to consider.

Bryan

On Dec 15, 2016, at 1:34 PM, RedRaven <[email protected]> wrote:

> [quoted text trimmed]

Thanks Bryan, I will look into this.

On Thursday, 15 December 2016 14:50:19 UTC-5, Bryan Van de ven wrote:

> [quoted text trimmed]