When declaring multiple sets of data (e.g. multiple x=[0,1,2,3], y=[0,1,4,9] pairs) in JSON, it seems there are two possible approaches.
In one (e.g. anscombe.py) you declare a single ColumnDataSource that contains all the data arrays.
In the other (e.g. stocks.py) you declare a ColumnDataSource for each x, y pair.
Which is the preferred approach? And, most importantly, are there situations where only one approach is valid?
I know one case involves multiple lines on one plot and the other multiple plots, but in principle it seems either approach could be used in either situation. Am I missing something?
The rule you should follow is that one data source should contain arrays of equal length. BokehJS assumes this in order to support scalar values, e.g. Circle(x=[1,2,3], y=[1,2,3], radius=0.5) (radius being the interesting part here). So you will have to use a separate data source for each glyph in most cases.
Mateusz
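To make the scalar-support point concrete, here is a toy sketch (not Bokeh's actual implementation; `resolve_glyph_values` is a hypothetical name) of why the array columns must share one length: a scalar like radius=0.5 is broadcast against that length, so the length has to be unambiguous.

```python
def resolve_glyph_values(**props):
    """Expand any scalar properties against the common array length."""
    lengths = {len(v) for v in props.values() if isinstance(v, list)}
    if len(lengths) != 1:
        # With unequal array lengths there is no single length to
        # broadcast a scalar against.
        raise ValueError(f"array properties have unequal lengths: {sorted(lengths)}")
    n = lengths.pop()
    return {k: (v if isinstance(v, list) else [v] * n) for k, v in props.items()}

resolved = resolve_glyph_values(x=[1, 2, 3], y=[1, 2, 3], radius=0.5)
# resolved["radius"] == [0.5, 0.5, 0.5]
```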
As Mateusz mentioned, the implicit assumption about the columns in a given data source is that they all have the same length. This could probably be stated more often, and more clearly, in the docs; some context might be helpful as well. The main reason for the assumption is that data sources also store the current selection, and selection indices can only make sense across different columns if those columns have the same length. This was probably a convenient thing to do early on, but I think we may have to revisit this model in the future.
Bryan
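A toy illustration of the selection argument (this models the idea, not Bokeh's internals): the selection is a list of row indices stored on the source and applied uniformly to every column, which only works when every column has the same length.

```python
# A data source's columns, plus a selection of row indices.
data = {"x": [0, 1, 2, 3], "y": [0, 1, 4, 9]}
selected = [1, 3]  # row indices of the current selection

# The same indices are applied to every column; an index valid for one
# column could be out of range for a shorter one.
selected_rows = {col: [vals[i] for i in selected] for col, vals in data.items()}
# selected_rows == {"x": [1, 3], "y": [1, 9]}
```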
re: selections - I would love to see us tackle that ASAP; a better selection model would enable much better things with server-side data.
re: data sources - Is there a reason not to constrain column data sources to having columns of the same length? Personally I would like to do that, and just throw errors on construction if it's violated.
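A minimal sketch of that kind of construction-time check (illustrative only; `StrictColumnSource` is a hypothetical class, not Bokeh's actual ColumnDataSource):

```python
class StrictColumnSource:
    """Hypothetical data source that rejects unequal column lengths up front."""

    def __init__(self, **columns):
        lengths = {name: len(vals) for name, vals in columns.items()}
        if len(set(lengths.values())) > 1:
            # Fail loudly at construction rather than later at render time.
            raise ValueError(f"columns must have equal lengths, got {lengths}")
        self.data = columns

StrictColumnSource(x=[0, 1, 2, 3], y=[0, 1, 4, 9])  # accepted
# StrictColumnSource(x=[0, 1, 2], y=[0, 1]) would raise ValueError
```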
Hugo, can you write up your thoughts on selections on a GH wiki page or similar when you get a chance? It seems we should maybe make a proper selection object that is decoupled from data sources, but I would like to see your thoughts on this.
Regarding hard checks on column data source columns, I think that is fine, or, at the very least, loud warnings.
I spent the last 3 hours knocking against this implicit assumption about the equality of column lengths. At the very least, we strongly need to warn about unequal lengths…
We shouldn't warn. This should be a validation error; otherwise people will simply ignore or overlook the warning and end up in the same place. Optimally, this should be fixed at the BokehJS level, so that scalars are expanded at the glyph level, not at the data source level. Even better, they would not be expanded at all; instead we would use generators or some other abstraction over arrays and scalars.
Mateusz
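One way to read "an abstraction over arrays and scalars" is a lazy view that yields values uniformly without ever materializing the expanded list for a scalar. A hypothetical sketch (in Python rather than the CoffeeScript BokehJS would use; `as_series` is an invented name):

```python
from itertools import repeat

def as_series(value, n):
    """Yield n values whether `value` is a sequence or a scalar, without
    materializing an expanded list for scalars."""
    if isinstance(value, (list, tuple)):
        if len(value) != n:
            raise ValueError(f"expected length {n}, got {len(value)}")
        yield from value
    else:
        # A scalar is repeated lazily instead of being copied n times.
        yield from repeat(value, n)

# Arrays and scalars can now be iterated together uniformly:
rows = list(zip(as_series([1, 2, 3], 3), as_series(0.5, 3)))
# rows == [(1, 0.5), (2, 0.5), (3, 0.5)]
```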
Yes, I think we should make optimizing scalars a task for 0.6. For now, I think we should raise an error in Python and log to the console in BokehJS.