Need Simple Example of Melting a Dataframe to Create a Nested Bar Chart

I am having difficulty grokking how to go from a melted datatable through a column data source to a nested bar chart. The below code does not work, and nets me “Expected x to reference fields in the supplied data source.” ( with some nice explanation btw, kudos ).

My desire is to produce something looking like this example: https://docs.bokeh.org/en/latest/docs/gallery/bar_nested_colormapped.html
I will need to be able to dynamically change what is plotted in the bar chart using CustomJS, but to do that I will need to get a CDS approach working.

My initial non-working attempt to use CDS

df_input = load_data("sppt_measurement_archive.csv")
melted_df = pd.melt(df_input, id_vars=['SOC', 'Frequency'], value_vars=['Avg_Power'])
freqs = df_input['Frequency'].unique().tolist()
socs = df_input['SOC'].unique().tolist()
x = [(soc, freq) for soc in socs for freq in freqs]
p = figure(x_range=FactorRange(*x), plot_height=250, title="Power",
           toolbar_location=None, tools="")
p.xgrid.grid_line_color = None
p.y_range.start = 0

p.vbar(x=['Frequency', 'SOC'], top='value', width=0.9, source=source)

The following code does render, albeit with too much information to be legible.

df_input = load_data("sppt_measurement_archive.csv")
melted_df = pd.melt(df_input, id_vars=['SOC', 'Frequency', 'Workload'], value_vars=['Avg_Power'])

freqs = df_input['Frequency'].unique().tolist()
socs = df_input['SOC'].unique().tolist()

x = [(soc, freq) for soc in socs for freq in freqs]
y = melted_df['value'].tolist()

p = figure(x_range=FactorRange(*x), plot_height=250, title="Power",
           toolbar_location=None, tools="")
p.xgrid.grid_line_color = None
p.y_range.start = 0

p.vbar(x=x, top=y, width=0.9)
show(p)

Can you give an example of the data?

I put together a csv file and example code to show what I am getting when I use the non-CDS method. Just of note, I eventually want users to be able to choose SOCs, Workloads, and Frequencies to determine what bars get displayed, and choose between Score and Power for the vertical axis, through CustomJS controls.

Avg_Power Frequency Score SOC Workload
500 500 2 R222 Heavy
600 1000 4 R222 Heavy
700 1200 8 R222 Heavy
900 1600 16 R222 Heavy
400 500 3 C3PA Heavy
800 1000 6 C3PA Heavy
900 1200 9 C3PA Heavy
1100 1600 12 C3PA Heavy
300 500 1 IG44 Heavy
450 1000 8 IG44 Heavy
750 1200 16 IG44 Heavy
850 1600 64 IG44 Heavy
600 500 5 BBATE Heavy
1100 1000 7 BBATE Heavy
1300 1200 11 BBATE Heavy
1700 1600 13 BBATE Heavy
50 500 0 R222 Light
60 1000 1 R222 Light
70 1200 2 R222 Light
90 1600 3 R222 Light
40 500 3 C3PA Light
80 1000 1 C3PA Light
90 1200 4 C3PA Light
110 1600 2 C3PA Light
30 500 6 IG44 Light
45 1000 4 IG44 Light
75 1200 8 IG44 Light
85 1600 2 IG44 Light
60 500 14 BBATE Light
110 1000 11 BBATE Light
130 1200 7 BBATE Light
170 1600 3 BBATE Light
from bokeh.io import show
from bokeh.models import FactorRange
from bokeh.plotting import figure
from bokeh.models.layouts import Column
import pandas as pd

df_input = pd.read_csv('SampleData.csv', skiprows=0, index_col=False, header=0)

melted_df = pd.melt(df_input, id_vars=['SOC', 'Frequency', 'Workload'], value_vars=['Avg_Power'])

freqs = df_input['Frequency'].unique().tolist()
socs = df_input['SOC'].unique().tolist()
workloads = df_input['Workload'].unique().tolist()
x = [(str(workload), str(freq), str(soc)) for soc in socs for freq in freqs for workload in workloads]
x2 = [(str(freq), str(soc)) for soc in socs for freq in freqs]
y = melted_df['value'].tolist()

p = figure(x_range=FactorRange(*x), plot_height=250, title="Power",
           toolbar_location=None, tools="")
p.xgrid.grid_line_color = None
p.y_range.start = 0

p.vbar(x=x, top=y, width=0.9)

p2 = figure(x_range=FactorRange(*x2), plot_height=250, title="Power",
           toolbar_location=None, tools="")
p2.xgrid.grid_line_color = None
p2.y_range.start = 0

p2.vbar(x=x2, top=y, width=0.9)
show(Column(p, p2))

Issues with your code:

  1. You don’t need melt since you have only one value column. Just use the column as is, that’s it
  2. If you want your values align with the axis, you have to provide the same x both to FactorRange and to vbar. You have to sort x for FactorRange manually - it will display it just the way you pass it, it won’t sort or group anything by itself

Try this:

df_input = df_input.sort_values(['Workload', 'Frequency', 'SOC'])
x = list(zip(df_input['Workload'], df_input['Frequency'].astype(str), df_input['SOC']))
ds = ColumnDataSource(dict(x=x, power=df_input['Avg_Power']))

p = figure(x_range=FactorRange(factors=x), plot_height=250, title="Power",
           toolbar_location=None, tools="", sizing_mode='stretch_both')
p.xgrid.grid_line_color = None
p.y_range.start = 0
p.vbar(x='x', top='power', width=0.9, source=ds)