ColumnDataSource from_df() issue

Hi fellow bokeh users,

I have the following code:

'''
from bokeh.plotting import ColumnDataSource
from sqlalchemy import create_engine
from config import *
import pandas as pd

engine = create_engine(mysql['connection'], pool_size=20, max_overflow=0)

df = pd.read_sql_query('SELECT * FROM event LEFT JOIN signature ON event.signature=signature.sig_id '
                       'LEFT JOIN sig_class ON signature.sig_class_id=sig_class.sig_class_id;', engine)

source = ColumnDataSource(df)

print(source)
'''

The DataFrame the query creates prints fine. However, when I attempt to create the ColumnDataSource, I get this error:

"AttributeError: 'DataFrame' object has no attribute 'tolist'"

Any help on this issue would be appreciated hugely.

Cheers.

Hey MrShookshank,
Try

print(source.data)

It should work. To access the data, use the .get('NameOfCol') method.

You can list all the available methods with dir(source).

Hope it helps

Rémi

Unfortunately that did not seem to work.

The error is thrown specifically on this line: "source = ColumnDataSource(df)",

which seems to be the conversion from the DataFrame to the ColumnDataSource.

As I said, print works with the DataFrame but not with the ColumnDataSource.

Try passing just one column of your "df" and check its type first. It must be a Series, or at least a list or array; if it is not, convert it to a list before passing it.
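The check suggested above could look roughly like the sketch below. The small frame and the helper name column_kinds are made up for illustration; the real frame comes from the MySQL query in the original post.

```python
import pandas as pd

# Hypothetical stand-in for the query result (the real data lives in MySQL).
df = pd.DataFrame({"sid": [1, 2, 3], "sig_name": ["a", "b", "c"]})


def column_kinds(frame):
    """Map each column label to the type name of what frame[label] returns."""
    return {label: type(frame[label]).__name__ for label in frame.columns}


kinds = column_kinds(df)
print(kinds)

# A single column selected by label is a Series, which does have .tolist().
single_column = {"sid": df["sid"].tolist()}
print(single_column)
```

Any label whose entry is not "Series" would be worth a closer look before the frame is handed to ColumnDataSource.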

Thanks Remi,

I can turn it into a list using .values.tolist(), which does the trick.

However, later in my program I use pd.concat to merge a couple of DataFrames, and obviously this does not accept a list as input.

Do you have any ideas for getting around this, e.g. safely merging lists with a specified index?

Thanks again.
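One way around the list problem, sketched under the assumption that the frames can stay DataFrames until the very end: concatenate first, then convert the combined frame to a dict of lists in a single step. The two tiny frames below are invented for illustration.

```python
import pandas as pd

# Hypothetical miniature frames standing in for the query results.
left = pd.DataFrame({"sid": [1, 2, 3]})
right = pd.DataFrame({"ip_src": [10, 20, 30]})

# Concatenate while everything is still a DataFrame ...
combined = pd.concat([left, right], axis=1)

# ... and only then convert to plain Python lists, one list per column.
data = combined.to_dict(orient="list")
print(data)

# A dict of equal-length lists like this can be passed on as
# ColumnDataSource(data=data); that call is left out so the sketch
# only needs pandas.
```

Note that .values.tolist() produces a list of rows, while a column data source wants one sequence per column, which is what orient="list" gives.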

My first attempt would be to gather your lists into one list and convert that to a DataFrame. Search Stack Overflow a little; you will almost certainly find a solution within 10 minutes.

Please note: you have not specified the version of Bokeh, or the version of Pandas. It's always advised to provide this kind of information when asking for help.

There is something unexpected or unusual about this DataFrame. You can look at the CDS code that converts DataFrames; it's only a few lines of code:

  https://github.com/bokeh/bokeh/blob/master/bokeh/models/sources.py#L112-L133

As you can see, it calls .tolist on the columns of the DataFrame, which is normally fine. Is yours somehow nested? It's really hard to say anything more specific unless you can provide code and data to reproduce the problem.

Thanks,

Bryan
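As a rough illustration of the conversion described above, the sketch below builds one list per column by calling .tolist() on each column. It is a simplified approximation written for this thread, not Bokeh's actual source; the function name frame_to_columns and the sample frame are assumptions.

```python
import pandas as pd


def frame_to_columns(frame):
    """Build {label: list of values}, calling .tolist() on each column."""
    columns = {}
    for label in frame.columns:
        # frame[label] is normally a Series, and Series has .tolist().
        columns[label] = frame[label].tolist()
    return columns


# Hypothetical well-formed frame: every label selects exactly one column.
frame = pd.DataFrame({"x": [1, 2, 3], "label": ["a", "b", "c"]})
columns = frame_to_columns(frame)
print(columns)
```

The AttributeError in the original post means that, for at least one label, the object being asked for .tolist() was a DataFrame rather than a Series.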

--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/001de17b-2115-487c-bbbf-d07f2029cde7%40continuum.io\.
For more options, visit https://groups.google.com/a/continuum.io/d/optout\.

Hi Bryan,

firstly sorry for the lack of information,

bokeh version = 0.12.4

pandas version = 0.19.2

My instinct was that the DataFrame was somehow broken, but I was unsure where.

My full code is:

from bokeh.plotting import ColumnDataSource
from sqlalchemy import create_engine
from config import *
import pandas as pd

engine = create_engine(mysql['connection'], pool_size=20, max_overflow=0)

df = pd.read_sql_query('SELECT * FROM event LEFT JOIN signature ON event.signature=signature.sig_id '
                       'LEFT JOIN sig_class ON signature.sig_class_id=sig_class.sig_class_id;', engine).dropna()

df2 = pd.read_sql_query('SELECT ip_src,ip_dst FROM iphdr;', engine)

df3 = pd.read_sql_query('SELECT tcp_sport, tcp_dport FROM tcphdr;', engine)

df4 = pd.concat([df, df2, df3], axis=1)

source = ColumnDataSource(df4)

print(source)

I have also tried adding .values.tolist() to df4, which results in this error: ValueError: expected an element of List(String), got seq with invalid items [0]

No pressure to spend time solving this for me; I'm just unsure whether it's an issue in bokeh, pandas or just my own ignorance :smiley:

I'm not sure how I could provide data, as it's within a MySQL database with 10k rows.

Thanks,

Sean

Well, like I said, I think there is something unexpected about the structure of your particular DataFrame, though offhand I don't know what. What is the output of df.info()? E.g.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
sepal_length 150 non-null float64
sepal_width 150 non-null float64
petal_length 150 non-null float64
petal_width 150 non-null float64
species 150 non-null object
dtypes: float64(4), object(1)
memory usage: 5.9+ KB

Thanks,

Bryan

Hi Bryan,

Yes, I should have checked my data previously; I'm quite new to bokeh and pandas in general.

Here is the info of each dataframe before handing to the ColumnDataSource.

df:
RangeIndex: 14063 entries, 0 to 14062
Data columns (total 13 columns):
sid 14063 non-null int64
cid 14063 non-null int64
signature 14063 non-null int64
timestamp 14063 non-null datetime64[ns]
sig_id 14063 non-null int64
sig_name 14063 non-null object
sig_class_id 14063 non-null int64
sig_priority 14063 non-null int64
sig_rev 14063 non-null int64
sig_sid 14063 non-null int64
sig_gid 14063 non-null int64
sig_class_id 14063 non-null int64
sig_class_name 14063 non-null object
dtypes: datetime64[ns](1), int64(10), object(2)
memory usage: 1.4+ MB

df2:
RangeIndex: 14066 entries, 0 to 14065
Data columns (total 2 columns):
ip_src 14066 non-null int64
ip_dst 14066 non-null int64
dtypes: int64(2)
memory usage: 219.9 KB

df3:
RangeIndex: 14070 entries, 0 to 14069
Data columns (total 2 columns):
tcp_sport 14070 non-null int64
tcp_dport 14070 non-null int64
dtypes: int64(2)
memory usage: 219.9 KB

df4:
RangeIndex: 14074 entries, 0 to 14073
Data columns (total 17 columns):
sid 14074 non-null int64
cid 14074 non-null int64
signature 14074 non-null int64
timestamp 14074 non-null datetime64[ns]
sig_id 14074 non-null int64
sig_name 14074 non-null object
sig_class_id 14074 non-null int64
sig_priority 14074 non-null int64
sig_rev 14074 non-null int64
sig_sid 14074 non-null int64
sig_gid 14074 non-null int64
sig_class_id 14074 non-null int64
sig_class_name 14074 non-null object
ip_src 14074 non-null int64
ip_dst 14074 non-null int64
tcp_sport 14074 non-null int64
tcp_dport 14074 non-null int64
dtypes: datetime64[ns](1), int64(14), object(2)
memory usage: 1.8+ MB

Looking at this now makes me think there are several issues to contend with as the rows do not seem to match.

Thanks again,

Sean
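On the mismatched row counts: pd.concat with axis=1 lines rows up by index label, not by any shared key, so frames of different lengths end up padded with missing values. The sketch below uses two invented frames to show the effect; how the poster's real frames combine is not shown in the thread.

```python
import pandas as pd

# Hypothetical frames with different row counts, as in the listing above.
events = pd.DataFrame({"sid": [1, 2]})             # 2 rows
headers = pd.DataFrame({"ip_src": [10, 20, 30]})   # 3 rows

# axis=1 aligns on the index; the shorter frame is padded with NaN.
wide = pd.concat([events, headers], axis=1)
print(wide)

# Which rows of "sid" had no counterpart in the shorter frame?
missing = wide["sid"].isna().tolist()
print(missing)
```

Row 2 of "sid" has no source value, so it becomes NaN and the column is upcast to float. Positional concatenation also only makes sense if row i of each query really describes the same event.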

Nothing looks immediately out of hand, except as you note, the rows don't line up. Another quick check: does passing each of the original data frames to CDS work? i.e. is it only the concat'ed data frame that fails?

Thanks,

Bryan

It seems to be the ‘df’ dataframe that throws the exception “‘DataFrame’ object has no attribute ‘tolist’”, and it is of course the one built from the most complex SQL query.

The other dataframes seem to be fine, apart from the concat’ed one.

‘df’ does work if I convert it to a list using .values.tolist(); however, this is not optimal as I cannot concatenate the result with the other frames.

I’m very unsure what the next step should be.
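
A possible alternative to .values.tolist(), assuming the failure comes from repeated column names (the df.info() output quoted further down lists sig_class_id twice): keep only the first column for each name, so the result is still a DataFrame that can be concatenated. This is a sketch, not a confirmed fix for this thread, and the sample frame and its values are made up.

```python
import pandas as pd

# Hypothetical frame with a repeated column name, as a SELECT * over joined
# tables can produce when both tables have a column called sig_class_id.
df = pd.DataFrame(
    [[1, 10, 1], [2, 20, 2]],
    columns=["sig_class_id", "sig_rev", "sig_class_id"],
)

# Index.duplicated() marks the second and later occurrences of each name;
# inverting the mask keeps only the first column for every name.
deduped = df.loc[:, ~df.columns.duplicated()]

# Every name now selects a Series, so a per-column .tolist() works.
columns_as_lists = {name: deduped[name].tolist() for name in deduped.columns}
print(columns_as_lists)
```

Selecting only the needed columns in the SQL query, or aliasing the repeated one, would avoid the duplicate at the source.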

Thanks again,

Sean

On Wednesday, February 22, 2017 at 5:11:27 PM UTC, Bryan Van de ven wrote:

Nothing looks immediately out of hand, except as you note, the rows don’t line up. Another quick check: does passing each of the original data frames to CDS work? i.e. is it only the concat’ed data frame that fails?

Thanks,

Bryan

On Feb 22, 2017, at 10:53, MrShookshank [email protected] wrote:

Hi Bryan,

Yes, I should have checked my data previously; I’m quite new to bokeh and pandas in general.

Here is the info of each dataframe before handing to the ColumnDataSource.

df:

RangeIndex: 14063 entries, 0 to 14062
Data columns (total 13 columns):
sid 14063 non-null int64
cid 14063 non-null int64
signature 14063 non-null int64
timestamp 14063 non-null datetime64[ns]
sig_id 14063 non-null int64
sig_name 14063 non-null object
sig_class_id 14063 non-null int64
sig_priority 14063 non-null int64
sig_rev 14063 non-null int64
sig_sid 14063 non-null int64
sig_gid 14063 non-null int64
sig_class_id 14063 non-null int64
sig_class_name 14063 non-null object
dtypes: datetime64[ns](1), int64(10), object(2)
memory usage: 1.4+ MB

df2:

RangeIndex: 14066 entries, 0 to 14065
Data columns (total 2 columns):
ip_src 14066 non-null int64
ip_dst 14066 non-null int64
dtypes: int64(2)
memory usage: 219.9 KB

df3:

RangeIndex: 14070 entries, 0 to 14069
Data columns (total 2 columns):
tcp_sport 14070 non-null int64
tcp_dport 14070 non-null int64
dtypes: int64(2)
memory usage: 219.9 KB

df4:

RangeIndex: 14074 entries, 0 to 14073
Data columns (total 17 columns):
sid 14074 non-null int64
cid 14074 non-null int64
signature 14074 non-null int64
timestamp 14074 non-null datetime64[ns]
sig_id 14074 non-null int64
sig_name 14074 non-null object
sig_class_id 14074 non-null int64
sig_priority 14074 non-null int64
sig_rev 14074 non-null int64
sig_sid 14074 non-null int64
sig_gid 14074 non-null int64
sig_class_id 14074 non-null int64
sig_class_name 14074 non-null object
ip_src 14074 non-null int64
ip_dst 14074 non-null int64
tcp_sport 14074 non-null int64
tcp_dport 14074 non-null int64
dtypes: datetime64[ns](1), int64(14), object(2)
memory usage: 1.8+ MB

Looking at this now makes me think there are several issues to contend with as the rows do not seem to match.
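
To illustrate why mismatched row counts matter for pd.concat(..., axis=1), here is a small sketch with made-up frames (the values are hypothetical; only the column names are borrowed from above). Frames are aligned on their index, so the shorter one is padded with NaN rather than raising an error, and the padded integer column becomes float64.

```python
import pandas as pd

# Two hypothetical frames with different numbers of rows.
events = pd.DataFrame({"sid": [1, 2, 3]})
headers = pd.DataFrame({"ip_src": [10, 20]})

# axis=1 places the frames side by side, matching rows by index label.
combined = pd.concat([events, headers], axis=1)

print(len(combined))                        # rows in the union of both indexes
print(combined["ip_src"].isna().sum())      # rows padded with NaN
print(combined["ip_src"].dtype)             # dtype after padding
```

If the rows are meant to correspond to each other, joining on a shared key (in SQL or with a pandas merge) is usually safer than relying on row position.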

Thanks again,

Sean

On Wednesday, February 22, 2017 at 4:38:44 PM UTC, Bryan Van de ven wrote:

Well, like I said, I think there is something unexpected about the structure of your particular data frame, though offhand I don’t know what. What is the output of df.info()? E.g.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
sepal_length 150 non-null float64
sepal_width 150 non-null float64
petal_length 150 non-null float64
petal_width 150 non-null float64
species 150 non-null object
dtypes: float64(4), object(1)
memory usage: 5.9+ KB

Thanks,

Bryan

On Feb 22, 2017, at 10:25, MrShookshank [email protected] wrote:

Hi Bryan,

Firstly, sorry for the lack of information.

bokeh version = 0.12.4
pandas version = 0.19.2

My instinct was that the dataframe was somehow broken, but I’m unsure where.

My full code is:

from bokeh.plotting import ColumnDataSource
from sqlalchemy import create_engine
from config import *
import pandas as pd

engine = create_engine(mysql['connection'], pool_size=20, max_overflow=0)

# Events joined to their signature and signature class; rows with missing values dropped
df = pd.read_sql_query('SELECT * FROM event LEFT JOIN signature ON event.signature=signature.sig_id '
'LEFT JOIN sig_class ON signature.sig_class_id=sig_class.sig_class_id;', engine).dropna()

# Source and destination addresses from the IP headers
df2 = pd.read_sql_query('SELECT ip_src,ip_dst FROM iphdr;', engine)

# Source and destination ports from the TCP headers
df3 = pd.read_sql_query('SELECT tcp_sport, tcp_dport FROM tcphdr;', engine)

# Place the three frames side by side, aligned on their row index
df4 = pd.concat([df, df2, df3], axis=1)

source = ColumnDataSource(df4)

print(source)

I have also tried adding .values.tolist() to df4, which results in the error: ValueError: expected an element of List(String), got seq with invalid items [0]

No pressure to spend time solving this for me; I’m just unsure whether it’s an issue in bokeh, pandas or just my own ignorance :smiley:

I’m not sure how I could provide data, as it’s within a MySQL database with 10k rows.

Thanks,

Sean

On Wednesday, February 22, 2017 at 3:10:03 PM UTC, Bryan Van de ven wrote:
Please note: you have not specified the version of Bokeh, or the version of Pandas. It’s always advised to provide this kind of information when asking for help.

There is something unexpected or unusual about this DataFrame. You can look at the CDS code that converts DataFrames, it’s only a few lines of code:

    [https://github.com/bokeh/bokeh/blob/master/bokeh/models/sources.py#L112-L133](https://github.com/bokeh/bokeh/blob/master/bokeh/models/sources.py#L112-L133)

As you can see, it calls .tolist on the columns of the DataFrame, which is normally fine. Is yours somehow nested? It’s really hard to say anything more specific unless you can provide code and data to reproduce the problem.
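
One way a column lookup can return something without .tolist, offered as an assumption rather than a confirmed diagnosis of this case: when two columns share a name, selecting that name yields a DataFrame instead of a Series, and a DataFrame has no .tolist method. The frame below is made up for illustration.

```python
import pandas as pd

# Hypothetical frame in which one column name appears twice.
df = pd.DataFrame(
    [[1, 10, 1], [2, 20, 2]],
    columns=["sig_class_id", "sig_rev", "sig_class_id"],
)

unique_col = df["sig_rev"]          # a name that occurs once selects a Series
repeated_col = df["sig_class_id"]   # a repeated name selects a DataFrame

print(type(unique_col).__name__, hasattr(unique_col, "tolist"))
print(type(repeated_col).__name__, hasattr(repeated_col, "tolist"))

# Index.duplicated() flags the second and later occurrences of a name.
print(df.columns[df.columns.duplicated()].tolist())
```

Checking df.columns for duplicates is therefore a cheap first test when a per-column conversion fails on a frame that otherwise prints fine.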

Thanks,

Bryan
