E-1019 (duplicate_factors)

Hello Bokeh community,

I’m having an issue with a vertical bar plot with categorical data (x-axis).

What are you trying to do?
I’m trying to plot a list of names on the horizontal axis and the number of days for each name on the vertical axis.

What have you tried that did NOT work as expected?
I either get a blank output with no error message, or I get no plot with the following error message.
I’m also unable to find a good solution by Googling this issue.

ERROR:bokeh.core.validation.check:E-1019 (DUPLICATE_FACTORS): FactorRange must specify a unique list of categorical factors for an axis: duplicate factors found: 'I'

I don’t see any duplicated categories, nor does pandas find any duplicated values.

Here is my code (I have extra imported packages not being used for the moment).
If anyone can point me in the right direction, much appreciated.

Bokeh version: 2.3.0

"|IMPORT PACKAGES|"
import numpy              as np
import pandas             as pd
import datetime
from   bokeh.plotting     import show, figure, output_file, save
from   bokeh.io           import show, output_notebook, curdoc, export_png
from   bokeh.models       import ColumnDataSource,LinearAxis, Range1d, NumeralTickFormatter, LabelSet, Label, BoxAnnotation, DatetimeTickFormatter, Text, Span
from   bokeh.models.tools import HoverTool
from   bokeh.models       import Arrow, NormalHead, OpenHead, VeeHead
from   bokeh.transform    import dodge
from   datetime           import datetime as dt

"|IMPORT DATA|"
path = r'https://github.com/ncachanosky/research/blob/master/Economic%20Series/'
file = r'Resumen%20Estadistico%20-%20Argentina.xlsx?raw=true'
IO   = path + file

sheet = 'DEFICIT FINANCIERO'

data = pd.read_excel(IO, sheet_name = sheet, usecols="AD,AG:AH", nrows=65, engine='openpyxl') # Be patient...

data = data[39:]

"|CHECK DATA|"
data

"|BUILD PLOT|"

cds = ColumnDataSource(data)

#BUILD FIGURE
p = figure(title        = 'MINISTROS DE ECONOMÍA',
           x_axis_label = '',
           y_axis_label = 'Dias',
           x_range      = 'MINISTRO',
           y_range      = '',
           plot_height  = 400,
           plot_width   = 700)

p.toolbar_location = "above"
p.toolbar.autohide = True

#AXIS 1 (LEFT)
p.vbar(x='MINISTRO', top='DAYS', color='blue', width=0.25, fill_alpha=0.50, legend_label='Días por ministro', muted_alpha=0.2, source=cds)

#LEGEND
p.legend.location     = "top_left"
p.legend.orientation  = "horizontal"
p.legend.click_policy = "mute"
show(p)

#DUPLICATES?
data.MINISTRO.duplicated()

@ncachanosky The problem is almost certainly this:

x_range      = 'MINISTRO',

For categorical ranges, the value of x_range should be a list of strings (the list of factors, in the order you want them to appear on the axis). But you have passed a single string, not a list of strings. I am guessing that Python duck-typing is causing the single string to be interpreted as a list of single-character strings, in which case the “I” is duplicated.
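A minimal sketch of the guess above, in plain Python with no Bokeh involved: a string is itself an iterable of one-character strings, so anything that treats 'MINISTRO' as a sequence of factors sees eight factors, one of them repeated. Whether Bokeh takes exactly this path internally is an assumption, but the result matches the factor named in the error message.

```python
from collections import Counter

# A str is an iterable of one-character strings.
x_range = 'MINISTRO'
factors = list(x_range)
print(factors)  # ['M', 'I', 'N', 'I', 'S', 'T', 'R', 'O']

# Collect every "factor" that occurs more than once.
duplicates = [char for char, count in Counter(factors).items() if count > 1]
print(duplicates)  # ['I']
```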


Thank you!

I wouldn’t have figured that one out.

However, if I use x_range = data['MINISTRO'], I do not get an error message, but I do get an empty output. There is a frame for the plot, but nothing in it.

@ncachanosky It’s not possible to speculate without knowing what the contents of data actually are. Otherwise I can only reiterate that the value passed to x_range should be a list of strings, in the order you want them to appear on the axis, containing no duplicates. There is more information and examples in the User’s Guide:

Handling categorical data — Bokeh 2.4.2 Documentation

Also, FYI, there may be relevant error messages in the browser’s JavaScript console.
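One way to check the requirement above before handing anything to x_range is a small helper like the following. This is only a sketch: to_factors is a hypothetical name, not part of Bokeh or pandas, and the three sample names are taken from the data in this thread.

```python
from collections import Counter

def to_factors(values):
    """Return values as a plain list of str, raising if any factor repeats.

    Hypothetical helper for illustration; not a Bokeh or pandas function.
    """
    if isinstance(values, str):
        # A lone string would be split into characters, so reject it early.
        raise TypeError("expected a sequence of strings, got a single string")
    factors = [str(value) for value in values]
    repeated = [f for f, count in Counter(factors).items() if count > 1]
    if repeated:
        raise ValueError(f"duplicate factors: {repeated}")
    return factors

names = ["Bernardo Grinspún", "Juan Vital Sourrouille", "Juan Carlos Pugliese"]
factors = to_factors(names)
print(factors)
```

With a pandas column, data['MINISTRO'].tolist() produces a plain list that could be passed through the same check and then used as x_range.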

I understand. I’m passing the first column as the categories for the x-axis.

Also, data.MINISTRO.duplicated() finds no duplicated values for the column “MINISTRO”.

    MINISTRO                DAYS  AVG
39  Bernardo Grinspún        436  501.807692
40  Juan Vital Sourrouille  1501  501.807692
41  Juan Carlos Pugliese      44  501.807692
42  Jesús Rodriguez           55  501.807692
43  Miguel Roig                8  501.807692
44  Nestor Rapanelli         153  501.807692
45  Antonio Erman Gonzales   412  501.807692
46  Domingo Cavallo (1)     2016  501.807692
47  Roque Fernandez         1190  501.807692
48  Jose Luis Machinea       448  501.807692
49  Ricardo Lopez Murphy      14  501.807692
50  Domingo Cavallo (2)      274  501.807692
51  Jorge Capitanich           2  501.807692
52  Rodolfo Frigerio           7  501.807692
53  Jorge Remes Lenicov       57  501.807692
54  Roberto Lavagna         1310  501.807692
55  Felisa Miceli            595  501.807692
56  Miguel Gustavo Peirano   147  501.807692
57  Martin Lusteau           136  501.807692
58  Carlos Fernandez         438  501.807692
59  Amado Boudou             886  501.807692
60  Hernan Lorenzino         709  501.807692
61  Axel Kicillof            751  501.807692
62  Alfonso Prat-Gay         387  501.807692
63  Nicolas Dujovne          959  501.807692
64  Hernan Lacunza           112  501.807692

@ncachanosky This is also going to cause a problem:

y_range      = '',

As before, if this was a categorical range, it should be a list of strings, not a single string (and definitely not the empty string). But it’s not a categorical range (I’m assuming, based on the data, which appears to be numeric along the y-axis), so it should not be a list of strings either.

If you want the standard default auto-ranging, you should not set y_range at all. Otherwise, if you want to set it explicitly, it should be a numeric tuple of (start, end) values.
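As a rough illustration of the second option, an explicit numeric range can be built from the data itself. The helper name numeric_range and the 5% headroom are assumptions made for this sketch; the day counts are the first five rows of the table in this thread.

```python
def numeric_range(values, padding=0.05):
    """Return a (start, end) tuple from 0 to slightly above the largest value.

    Hypothetical helper for illustration; padding is a fraction of the maximum.
    """
    top = max(values)
    return (0, top * (1 + padding))

days = [436, 1501, 44, 55, 8]
y_range = numeric_range(days)
print(y_range)

# The tuple would then be passed as figure(..., y_range=y_range).
# For default auto-ranging, leave y_range out of the figure() call entirely.
```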


Yes! That was it!

Thank you for the guidance and your patience with this new Bokeh user. :smile:
