DataTable creates a new 'index' column

Hi Guys,

I have a Dashboard where I stream my data from a ColumnDataSource object.

I created a DataTable for seeing the raw data coming in, however the DataTable creates a new ‘index’ Column which I cannot stream to.

source = ColumnDataSource(data={
    'cid': [],
    'ip_dst': [],
    'ip_src': [],
    'sid': [],
    'sig_class_name': [],
    'sig_gid': [],
    'sig_id': [],
    'sig_name': [],
    'sig_priority': [],
    'sig_rev': [],
    'sig_sid': [],
    'signature': [],
    'tcp_dport': [],
    'tcp_sport': [],
    'timestamp': [],
})

datatable = DataTable(source=source, fit_columns=True, width=1000, height=400, columns=[
    TableColumn(field='cid', title='cid'),
    TableColumn(field='ip_dst', title='ip_dst'),
    TableColumn(field='ip_src', title='ip_src'),
    TableColumn(field='sid', title='sid'),
    TableColumn(field='sig_class_name', title='class_name'),
    TableColumn(field='sig_gid', title='sig_gid'),
    TableColumn(field='sig_id', title='sig_id'),
    TableColumn(field='sig_name', title='sig_name'),
    TableColumn(field='sig_priority', title='priority'),
    TableColumn(field='sig_rev', title='sig_rev'),
    TableColumn(field='sig_sid', title='sig_sid'),
    TableColumn(field='signature', title='sig_num'),
    TableColumn(field='tcp_dport', title='dport'),
    TableColumn(field='tcp_sport', title='sport'),
    TableColumn(field='timestamp', title='timestamp'),
])

def update():
    df = pd.read_sql_query('SELECT * FROM event LEFT JOIN signature ON event.signature=signature.sig_id '
                           'LEFT JOIN sig_class ON signature.sig_class_id=sig_class.sig_class_id;',
                           engine).drop('sig_class_id', axis=1)
    df2 = pd.read_sql_query('SELECT ip_src,ip_dst FROM iphdr;', engine)
    df3 = pd.read_sql_query('SELECT tcp_sport, tcp_dport FROM tcphdr;', engine)
    df4 = pd.concat([df, df2, df3], axis=1, verify_integrity=True).to_dict(orient='list')
    source.stream(df4, 300)

I'm unsure where the ‘index’ column in the DataTable comes from.

Since the data I stream to the ColumnDataSource does not contain an ‘index’ column, the application breaks with ValueError('Must stream updates to all existing columns (missing: index)').

bokeh version = 0.12.4

pandas version = 0.19.2

Cheers,

Sean Cruikshank

Hey Sean,

I tried to recreate the error and could not. I created a data table from a single-column source and streamed values to it from a DataFrame using the to_dict() method. This didn’t cause an ‘index’ column to be created.

from bokeh.models import ColumnDataSource, DataTable, TableColumn
source = ColumnDataSource(dict(x=[]))
source.data
# {'x': []}
table = DataTable(source=source, fit_columns=True, width=1000, height=400, columns=[TableColumn(field='x', title='x')])
source.data
# {'x': []}
table.source.data
# {'x': []}
import pandas as pd
df = pd.DataFrame(dict(x=['a','b', 'c']))
source.stream(df.to_dict('list'))
source.data
# {'x': ['a', 'b', 'c']}
source.stream(df.to_dict('list'), 1)
source.data
# {'x': ['c']}
source.stream(df.to_dict('list'), 3)
source.data
# {'x': ['a', 'b', 'c']}
source.stream(df.to_dict('list'), 5)
source.data
# {'x': ['b', 'c', 'a', 'b', 'c']}
table.source.data
# {'x': ['b', 'c', 'a', 'b', 'c']}

This makes me think that the issue is happening somewhere else in your code. If you are still having trouble, you might want to try reducing your code to a minimal, reproducible example. This should help you figure out where the index column is being introduced into your ColumnDataSource, and it will make it easier for others to help.

Regards,
Tyler
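
One way to narrow this down is to compare the column names already in source.data with the keys of the dict about to be streamed, right before the source.stream() call. The following is only a minimal sketch: plain dicts stand in for source.data and for the streamed dict, and missing_columns is a hypothetical helper, not part of Bokeh.

```python
def missing_columns(existing_data, new_data):
    """Return the column names present in existing_data but absent from new_data."""
    return sorted(set(existing_data) - set(new_data))


# Plain-dict stand-ins (assumptions for illustration only):
# `existing` plays the role of source.data, `update` the dict passed to stream().
existing = {"index": [0, 1], "cid": [10, 11], "sid": [1, 1]}
update = {"cid": [12], "sid": [1]}

# Any name listed here is a column the stream call would be missing.
print(missing_columns(existing, update))
```

Printing the result of missing_columns(source.data, df4) at each point where the source is touched should show the step after which ‘index’ first appears.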

···

On Wed, Mar 22, 2017 at 11:32 AM, MrShookshank [email protected] wrote:


You received this message because you are subscribed to the Google Groups “Bokeh Discussion - Public” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/46ed5dae-9b9b-4060-a5e6-7023a2a0225b%40continuum.io.

For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Hi Tyler,

This sounds similar to an issue I previously posted - see here.

To test this out I added show(WidgetBox(table)) to your code and ran it. I got the same problem as before: it looks fine until you zoom in or out in the browser (Firefox in my case), at which point an extra initial column appears, named only ‘#’. Does this have any relationship to the extra ‘index’ column Sean is seeing?

Thanks,
Marcus.

Hi,

There was a time when automatically adding an index column made the most sense, but it may be time to rethink this, or at least add an option to disable it when it is not desired. It would be great if one of you could make a GitHub issue to discuss this further.

Thanks,

Bryan
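
Until such an option exists, one possible workaround is to supply the missing column yourself before streaming. This is a sketch under assumptions, not a confirmed fix for Bokeh 0.12.4: with_index is a hypothetical helper, plain dicts stand in for source.data and the streamed dict, and the ‘index’ values are assumed to be consecutive integers.

```python
def with_index(existing_data, new_data, name="index"):
    """Return a copy of new_data, adding a `name` column when the
    existing data has one and the new data does not."""
    padded = dict(new_data)
    if name not in existing_data or name in new_data:
        return padded  # nothing to add
    # Number of rows being appended (0 if new_data has no columns).
    n_new = len(next(iter(new_data.values()), []))
    current = existing_data[name]
    # Continue from the last existing value, or start at 0 if empty.
    start = int(current[-1]) + 1 if len(current) else 0
    padded[name] = list(range(start, start + n_new))
    return padded


# Plain-dict stand-ins (assumptions for illustration only).
existing = {"index": [0, 1], "cid": [10, 11]}
update = {"cid": [12, 13]}
print(with_index(existing, update))
```

In the update() callback from the original post this would correspond to something like source.stream(with_index(source.data, df4), 300), which keeps the set of streamed columns equal to the set of existing ones.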

···

On Mar 31, 2017, at 02:07, Marcus Donnelly <[email protected]> wrote:


Hi Bryan,

I’ve just raised an issue on the extra column appearing.

Thanks,
Marcus.