Bar() interface silently fails when used with data that contains numpy.nan

Hi there,

I had some issues when trying to visualise some data using the Bar() class, using information extracted from a Pandas data frame which contained NaNs.

Just thought I’d flag this, as it tripped me up with a silent failure in my web app (I ended up with a completely blank canvas, as none of my plots rendered). Here’s some very brief reproduction code for what I assume is the basic issue, using a Bokeh example. It’s a lovely helper class, but my feature request (if possible) would be to raise an explicit error so people can quickly figure out where the problem is.

Whaddya think?

import numpy
from collections import OrderedDict
import pandas as pd
from bokeh.charts import Bar, output_file, show
from bokeh.sampledata.olympics2014 import data

df = pd.io.json.json_normalize(data['data'])

# filter by countries with at least one medal and sort
df = df[df['medals.total'] > 0]
df = df.sort("medals.total", ascending=False)

# inject a NaN into the bronze column to reproduce the problem
df['medals.bronze'][65] = numpy.nan

# get the countries and group the data by medal type
countries = df.abbr.values.tolist()
gold = df['medals.gold'].astype(float).values
silver = df['medals.silver'].astype(float).values
bronze = df['medals.bronze'].astype(float).values

# build a dict containing the grouped data
medals = OrderedDict(bronze=bronze, silver=silver, gold=gold)

output_file("stacked_bar.html")

bar = Bar(medals, countries, title="Stacked bars", stacked=True)

show(bar)

Related? BEP3: Charts interface (https://github.com/bokeh/bokeh/issues/1373)


TLDR - To make it work you can do the following:

from bokeh.models import Range1d
bar = Bar(medals, countries, title="Stacked bars", stacked=True, continuous_range=Range1d(0, 23))

To make Bokeh better, I’ve opened an issue: https://github.com/bokeh/bokeh/issues/2288

The long story:

bar.y_range is what’s having the problem. We should have:

bar.y_range.start
0
bar.y_range.end
23.1

but instead we have:

bar.y_range.start
nan
bar.y_range.end
nan

You can work around this by passing in an explicit range with the continuous_range parameter (code above).
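A minimal sketch of deriving that explicit range from the data instead of hard-coding it. This is not how Bokeh computes its range; the helper name `stacked_range_end`, the choice to count NaN as 0, and the 10% padding are all assumptions made for illustration.

```python
import math


def stacked_range_end(series_by_name, padding=0.1):
    """Return an upper axis bound for stacked bars, ignoring NaN values.

    series_by_name maps a series name to a sequence of numbers, one per
    category. NaN entries are counted as 0 (an assumption, not Bokeh's rule).
    """
    # Regroup the values so that each tuple holds one category's stack.
    columns = zip(*series_by_name.values())
    totals = [
        sum(0.0 if math.isnan(v) else float(v) for v in column)
        for column in columns
    ]
    return max(totals) * (1.0 + padding)


# Hypothetical sample data with one missing bronze value.
medals = {
    "bronze": [2.0, float("nan"), 1.0],
    "silver": [3.0, 1.0, 0.0],
    "gold": [5.0, 2.0, 1.0],
}

# A plain sum lets the NaN leak into the total for the second category.
naive_totals = [sum(column) for column in zip(*medals.values())]
print(naive_totals)

end = stacked_range_end(medals)
print(end)
```

With the code from this thread, the result could then be passed as continuous_range=Range1d(0, end) rather than a fixed Range1d(0, 23).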

The magic where this all happens is here: https://github.com/bokeh/bokeh/blob/master/bokeh/charts/builder/bar_builder.py#L172

There are a bunch of tweaks I am planning to make to Bar, including the ability to handle positive and negative data better, horizontal bar charts, and even pyramid plots.

However, there is a final push going on in May to solidify the Charts interface, so these tweaks are best handled after that. When I’m poking around that code area again, I’ll take a look at it, and I have opened an issue: https://github.com/bokeh/bokeh/issues/2288

Hope this helps,

Sarah Bird

···

On Wed, May 13, 2015 at 6:15 AM, Hugo Carr [email protected] wrote:

Related? https://github.com/bokeh/bokeh/issues/1373


You received this message because you are subscribed to the Google Groups “Bokeh Discussion - Public” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/0f89aecb-ab7c-4dd2-a45b-23e05e7e5533%40continuum.io.

For more options, visit https://groups.google.com/a/continuum.io/d/optout.

I have also run into similar problems, where I’ve had some issue with my data source and am then greeted with a completely blank plot. So while gracefully handling np.nan would be great, I would also recommend that Bokeh fail loudly if it doesn’t like the format of your data. The first time I ran into this was especially difficult as a new user, because I had no idea where to look. My solution was to take an example and slowly merge it into my project to locate my error. This worked, but was rather tedious.

But really though, my thanks to everyone working on Bokeh. I am finding it very useful.

Brent
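A minimal sketch of the "fail loudly" behaviour asked for in this thread, written as a check you could run on your own data before handing it to a chart. The helper name `check_no_nan` and the error message format are hypothetical and not part of Bokeh.

```python
def _is_nan(value):
    # NaN is the only float value that is not equal to itself; this also
    # works for NumPy scalars and is simply False for strings and ints.
    return value != value


def check_no_nan(series_by_name):
    """Raise ValueError if any series contains NaN, naming where it occurs."""
    problems = {}
    for name, values in series_by_name.items():
        bad = [i for i, v in enumerate(values) if _is_nan(v)]
        if bad:
            problems[name] = bad
    if problems:
        details = ", ".join(
            "%s at positions %s" % (name, bad) for name, bad in problems.items()
        )
        raise ValueError("NaN values found: " + details)
    return series_by_name


try:
    check_no_nan({"bronze": [1.0, float("nan")], "gold": [2.0, 3.0]})
except ValueError as exc:
    print(exc)  # NaN values found: bronze at positions [1]
```

Calling something like this on the medals dict just before Bar(...) would turn the blank canvas into an error that points at the offending series.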


Awesome! Thanks very much.


I think that’s a very fair comment, Brent. Even though I know about this, I was tripped up by it again this morning.


Agreed. Thanks for your feedback.

Turning silent failures into explicit errors is definitely something we have been discussing, and it is one of the things we look forward to improving in the near future. It’s definitely one of those things that improves the user experience.

Cheers

Fabio


Fabio Pliger