Can't render heatmap data for Apache Zeppelin's pyspark dataframe

Hi! I have already raised this question here: python - What is wrong with Bokeh image() plotting? It succeed but showed no graph - Stack Overflow

I initially have a Spark dataframe with data like this:

+-------------------+--------------+------+-----+
|window_time        |delayWindowEnd|values|index|
+-------------------+--------------+------+-----+
|2022-01-24 18:00:00|999           |999   |2    |
|2022-01-24 19:00:00|999           |999   |1    |
|2022-01-24 20:00:00|999           |999   |3    |
|2022-01-24 21:00:00|999           |999   |4    |
|2022-01-24 22:00:00|999           |999   |5    |
|2022-01-24 18:00:00|998           |998   |4    |
|2022-01-24 19:00:00|998           |998   |5    |
|2022-01-24 20:00:00|998           |998   |3    |

and I’d like to plot that as a heatmap with the following code in Apache Zeppelin:

%spark.pyspark

import bkzep
import numpy as np
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, ColorBar, LogColorMapper
from bokeh.layouts import gridplot
from pyspark.sql.functions import col, coalesce, lit, monotonically_increasing_id
from pyspark.sql import DataFrame
from pyspark.sql.functions import *

output_notebook(notebook_type='zeppelin')

then

%pyspark

from pyspark.sql.functions import *

def plot_summaries(sensor, dfName):
    df = sqlContext.table(dfName)
    pdf = df.toPandas()
    source = ColumnDataSource(pdf)

    color_mapper = LogColorMapper(palette="Viridis256", low=1, high=10)

    plot = figure(toolbar_location=None,x_axis_type='datetime')
    plot.image(x='window_time', y='delayWindowEnd', source=source, image='index',dw=1,dh=1,  color_mapper=color_mapper)

    color_bar = ColorBar(color_mapper=color_mapper, label_standoff=12)

    plot.add_layout(color_bar, 'right')
    show(gridplot([plot], ncols=1, plot_width=1000, plot_height=400))

sensors = [   
"all"
]

and then finally

%pyspark

from pyspark.sql.functions import *

keyCol = "month_day_hour"

sensors = [
    "all"]


for sensor in sensors:
    plot_summaries(sensor, "maxmin2")   

The last one completes successfully, but I see no graph.

In case the cause is browser-side rendering, here is the JS error from the console:

polyfills.d42c9551b0788083cd69.js:1 Uncaught Error: Error rendering Bokeh model: could not find #fb19be38-e25a-4ebf-a488-593cd2e9a4d6 HTML tag
    at o (bokeh-1.3.4.min.js:31:143801)
    at Object.n._resolve_root_elements (bokeh-1.3.4.min.js:31:144274)
    at Object.n.embed_items_notebook (bokeh-1.3.4.min.js:31:147281)
    at embed_document (<anonymous>:6:20)
    at <anonymous>:15:9
    at e.invokeTask (polyfills.d42c9551b0788083cd69.js:1:8063)
    at t.runTask (polyfills.d42c9551b0788083cd69.js:1:3241)
    at t.invokeTask (polyfills.d42c9551b0788083cd69.js:1:9170)
    at i.useG.invoke (polyfills.d42c9551b0788083cd69.js:1:9061)
    at n.args.<computed> (polyfills.d42c9551b0788083cd69.js:1:38948)

Meanwhile, the response from the Zeppelin backend with the execution and plotting results, which reaches the browser over the websocket, looks well-formed and more or less correct.

That’s probably because I am misusing the parameters.
Is it OK to use a dataframe column as the image parameter (while the other two are the x and y axes)? Are dh and dw correctly initialized? Is it OK for the X axis to be a timestamp? I have seen in some discussions that the browser error can be caused by wrong parameter usage even though the server side responds with no error (like here: Error rendering Bokeh model: could not find HTML tag).

https://pastebin.com/pLWBA8Cv is the server response via the websocket that isn’t rendered.

I expect the graph to be plotted, and with the code above it is plotted when the glyph isn't a heatmap.

Bokeh’s Image glyph takes a 2D array containing pixel values, the coords for the lower left corner of the image, and the width/height of each pixel. Your input dataframe is a flat table containing the coords of each “pixel”, and is probably better served using the Rect glyph (see https://docs.bokeh.org/en/2.4.2/docs/reference/plotting/figure.html?highlight=rect#bokeh.plotting.Figure.rect ).

If you are dead set on using the image glyph you will need to rearrange your data similar to this example: Dynamically updating multiple 2D numpy arrays in Image Glyph - #2 by gmerritt123 , which would probably involve pandas’ pivot and to_numpy methods before initializing the columndatasource.
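To make the difference between the two shapes concrete, here is a minimal sketch of the pivot step only. The four rows are made up to resemble the table in the question, and the variable names (flat, grid, image) are just for illustration:

```python
import pandas as pd

# Hypothetical flat table shaped like the one in the question:
# one row per (window_time, delayWindowEnd) cell, cell value in "index".
flat = pd.DataFrame({
    "window_time": pd.to_datetime(
        ["2022-01-24 18:00", "2022-01-24 19:00"] * 2
    ),
    "delayWindowEnd": [998, 998, 999, 999],
    "index": [4, 5, 2, 1],
})

# Rows become y (delayWindowEnd), columns become x (window_time).
grid = flat.pivot(index="delayWindowEnd", columns="window_time", values="index")

# The Image glyph wants one plain 2D array; its first row is drawn at the bottom.
image = grid.to_numpy()
print(image.shape)  # (2, 2)
```

The resulting array would then go into the ColumnDataSource as a single list element (for example 'im': [image]), while the flat table could be handed to a rect glyph unchanged.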

Thank you for your answer! I think I have understood the idea, but I am still having problems with the API when trying to implement it:

I have the pandas dataframe pdf and extract a 2D array from it, naming it A:

pdf = df.toPandas()
rowIDs = pdf['values']
colIDs = pdf['window_time']
A = pdf.pivot_table('index', 'values', 'window_time', fill_value=0)

But then I’m not sure how to initialize the ColumnDataSource so that it gets both pdf with its column names and the 2D array, as I can’t find a constructor that takes both:

#source = ColumnDataSource(pdf) -- that previously worked for conventional graphs

src = ColumnDataSource(data={'x':[0],'y':[0],'dw':[10],'dh':[10],'im':[A]})

How can I pass both A for im and pdf for forming x and y? I can’t find a description of what the data structure means. Is there an example of a more complex ColumnDataSource initialization?

rowIDs = pdf['values']
colIDs = pdf['window_time']

A = pdf.pivot_table('index', 'values', 'window_time', fill_value=0)

print(A)
src = ColumnDataSource(data={'x':A.window_time,'y':A.values,'dw':[10],'dh':[10],'im':A})
#source = ColumnDataSource(pdf)

color_mapper = LogColorMapper(palette="Viridis256", low=1, high=100)

plot = figure(toolbar_location=None,x_axis_type='datetime')
plot.image(x='x', y='y', source=src, image='im',dw=1,dh=1,  color_mapper=color_mapper)

Here I have problems with getting the dimensions right. I expected to take them from A with

'x':A.window_time,'y':A.values

But I’m getting an error like this:


[1786 rows x 5 columns]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-491-a9917aea4e22> in <module>
      8 
      9 for sensor in sensors:
---> 10     plot_summaries(sensor, "maxmin2")

<ipython-input-488-75a9d7625b7d> in plot_summaries(sensor, dfName)
     18 
     19     print(A)
---> 20     src = ColumnDataSource(data={'x':A.window_time,'y':A.values,'dw':[10],'dh':[10],'im':A})
     21     #source = ColumnDataSource(pdf)
     22 

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5178             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5179                 return self[name]
-> 5180             return object.__getattribute__(self, name)
   5181 
   5182     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'window_time'

So how can I access the needed dimension of the A dataframe?
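For reference, a small sketch of where the two dimensions end up after pivot_table. The sample rows are invented; the column names follow the snippet above. After the pivot, window_time is no longer a column but the name of the column labels, and A.values is the built-in pandas attribute that returns the underlying array, not the original 'values' column:

```python
import pandas as pd

# Made-up rows using the same column names as pdf in the post above.
pdf = pd.DataFrame({
    "window_time": pd.to_datetime(
        ["2022-01-24 18:00", "2022-01-24 19:00"] * 2
    ),
    "values": [998, 998, 999, 999],
    "index": [4, 5, 2, 1],
})

A = pdf.pivot_table(values="index", index="values", columns="window_time", fill_value=0)

x_labels = A.columns  # the distinct window_time values (x direction)
y_labels = A.index    # the distinct 'values' entries (y direction)

print(A.columns.name, A.index.name)  # window_time values
print(type(A.values).__name__)       # ndarray
```

So the dimensions are available as A.columns and A.index, and the 2D array itself as A.to_numpy().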

OK, technically I have got everything to execute without a Python error using the following code (still without understanding the parameters of the data structure, only making all columns the same size):

    rowIDs = pdf['values']
    colIDs = pdf['window_time']

    A = pdf.pivot_table('index', 'values', 'window_time', fill_value=0)

    print(A)
    print(A.columns)
    src = ColumnDataSource(data={'x':A,'y':A,'dw':A,'dh':A,'im':A})
    #source = ColumnDataSource(pdf)
  

    color_mapper = LogColorMapper(palette="Viridis256", low=1, high=100)

    plot = figure(toolbar_location=None,x_axis_type='datetime')
    plot.image(x='x', y='y', source=src, image='im',dw=1,dh=1,  color_mapper=color_mapper)

But I still don’t get any image (not even one plotted in a random, uncontrolled way).

And the JS console gives me:


bokeh-1.3.4.min.js:31 Uncaught TypeError: CreateListFromArrayLike called on non-object
    at Object.l [as concat] (bokeh-1.3.4.min.js:31:83812)
    at e._set_data (bokeh-1.3.4.min.js:31:285777)
    at e.n.set_data (bokeh-1.3.4.min.js:31:276437)
    at e.set_data (bokeh-1.3.4.min.js:31:418226)
    at e.initialize (bokeh-1.3.4.min.js:31:416480)
    at e [as constructor] (bokeh-1.3.4.min.js:31:118431)
    at e [as constructor] (bokeh-1.3.4.min.js:31:20090)
    at e [as constructor] (bokeh-1.3.4.min.js:31:426703)
    at e [as constructor] (bokeh-1.3.4.min.js:31:414746)
    at new e (bokeh-1.3.4.min.js:31:415476)

"<script type=\"text/javascript\">(function(root) {\n function embed_document(root) {\n \n var docs_json = {\"9effa7f3-929a-45ae-af79-1be68dfb3616\":{\"roots\":{\"references\":[{\"attributes\":{\"children\":[{\"id\":\"21616\",\"type\":\"ToolbarBox\"},{\"id\":\"21614\",\"type\":\"GridBox\"}]},\"id\":\"21617\",\"type\":\"Column\"},{\"attributes\":{\"months\":[0,2,4,6,8,10]},\"id\":\"21606\",\"type\":\"MonthsTicker\"},{\"attributes\":{\"ticker\":{\"id\":\"21562\",\"type\":\"DatetimeTicker\"}},\"id\":\"21565\",\"type\":\"Grid\"},{\"attributes\":{\"formatter\":{\"id\":\"21594\",\"type\":\"DatetimeTickFormatter\"},\"ticker\":{\"id\":\"21562\",\"type\":\"DatetimeTicker\"}},\"id\":\"21561\",\"type\":\"DatetimeAxis\"},{\"attributes\":{\"tools\":[{\"id\":\"21571\",\"type\":\"PanTool\"},{\"id\":\"21572\",\"type\":\"WheelZoomTool\"},{\"id\":\"21573\",\"type\":\"BoxZoomTool\"},{\"id\":\"21574\",\"type\":\"SaveTool\"},{\"id\":\"21575\",\"type\":\"ResetTool\"},{\"id\":\"21576\",\"type\":\"HelpTool\"}]},\"id\":\"21615\",\"type\":\"ProxyToolbar\"},{\"attributes\":{\"overlay\":{\"id\":\"21610\",\"type\":\"BoxAnnotation\"}},\"id\":\"21573\",\"type\":\"BoxZoomTool\"},{\"attributes\":{\"color_mapper\":{\"id\":\"21551\",\"type\":\"LogColorMapper\"},\"dh\":{\"units\":\"data\",\"value\":1},\"dw\":{\"units\":\"data\",\"value\":1},\"image\":{\"field\":\"im\"},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"21586\",\"type\":\"Image\"},{\"attributes\":{\"high\":100,\"low\":1,\"palette\":[\"#440154\",\"#440255\",\"#440357\",\"#450558\",\"#45065A\",\"#45085B\",\"#46095C\",\"#460B5E\",\"#460C5F\",\"#460E61\",\"#470F62\",\"#471163\",\"#471265\",\"#471466\",\"#471567\",\"#471669\",\"#47186A\",\"#48196B\",\"#481A6C\",\"#481C6E\",\"#481D6F\",\"#481E70\",\"#482071\",\"#482172\",\"#482273\",\"#482374\",\"#472575\",\"#472676\",\"#472777\",\"#472878\",\"#472A79\",\"#472B7A\",\"#472C7B\",\"#462D7C\",\"#462F7C\",\"#46307D\",\"#46317E\",\"#45327F\",\"#45347F\",\"#453580\",\"#453681\",\"#443781\",\
"#443982\",\"#433A83\",\"#433B83\",\"#433C84\",\"#423D84\",\"#423E85\",\"#424085\",\"#414186\",\"#414286\",\"#404387\",\"#404487\",\"#3F4587\",\"#3F4788\",\"#3E4888\",\"#3E4989\",\"#3D4A89\",\"#3D4B89\",\"#3D4C89\",\"#3C4D8A\",\"#3C4E8A\",\"#3B508A\",\"#3B518A\",\"#3A528B\",\"#3A538B\",\"#39548B\",\"#39558B\",\"#38568B\",\"#38578C\",\"#37588C\",\"#37598C\",\"#365A8C\",\"#365B8C\",\"#355C8C\",\"#355D8C\",\"#345E8D\",\"#345F8D\",\"#33608D\",\"#33618D\",\"#32628D\",\"#32638D\",\"#31648D\",\"#31658D\",\"#31668D\",\"#30678D\",\"#30688D\",\"#2F698D\",\"#2F6A8D\",\"#2E6B8E\",\"#2E6C8E\",\"#2E6D8E\",\"#2D6E8E\",\"#2D6F8E\",\"#2C708E\",\"#2C718E\",\"#2C728E\",\"#2B738E\",\"#2B748E\",\"#2A758E\",\"#2A768E\",\"#2A778E\",\"#29788E\",\"#29798E\",\"#287A8E\",\"#287A8E\",\"#287B8E\",\"#277C8E\",\"#277D8E\",\"#277E8E\",\"#267F8E\",\"#26808E\",\"#26818E\",\"#25828E\",\"#25838D\",\"#24848D\",\"#24858D\",\"#24868D\",\"#23878D\",\"#23888D\",\"#23898D\",\"#22898D\",\"#228A8D\",\"#228B8D\",\"#218C8D\",\"#218D8C\",\"#218E8C\",\"#208F8C\",\"#20908C\",\"#20918C\",\"#1F928C\",\"#1F938B\",\"#1F948B\",\"#1F958B\",\"#1F968B\",\"#1E978A\",\"#1E988A\",\"#1E998A\",\"#1E998A\",\"#1E9A89\",\"#1E9B89\",\"#1E9C89\",\"#1E9D88\",\"#1E9E88\",\"#1E9F88\",\"#1EA087\",\"#1FA187\",\"#1FA286\",\"#1FA386\",\"#20A485\",\"#20A585\",\"#21A685\",\"#21A784\",\"#22A784\",\"#23A883\",\"#23A982\",\"#24AA82\",\"#25AB81\",\"#26AC81\",\"#27AD80\",\"#28AE7F\",\"#29AF7F\",\"#2AB07E\",\"#2BB17D\",\"#2CB17D\",\"#2EB27C\",\"#2FB37B\",\"#30B47A\",\"#32B57A\",\"#33B679\",\"#35B778\",\"#36B877\",\"#38B976\",\"#39B976\",\"#3BBA75\",\"#3DBB74\",\"#3EBC73\",\"#40BD72\",\"#42BE71\",\"#44BE70\",\"#45BF6F\",\"#47C06E\",\"#49C16D\",\"#4BC26C\",\"#4DC26B\",\"#4FC369\",\"#51C468\",\"#53C567\",\"#55C666\",\"#57C665\",\"#59C764\",\"#5BC862\",\"#5EC961\",\"#60C960\",\"#62CA5F\",\"#64CB5D\",\"#67CC5C\",\"#69CC5B\",\"#6BCD59\",\"#6DCE58\",\"#70CE56\",\"#72CF55\",\"#74D054\",\"#77D052\",\"#79D151\",\"#7CD24F\",\"#7ED24E\",\"#81D34C\",\"#83D34B
\",\"#86D449\",\"#88D547\",\"#8BD546\",\"#8DD644\",\"#90D643\",\"#92D741\",\"#95D73F\",\"#97D83E\",\"#9AD83C\",\"#9DD93A\",\"#9FD938\",\"#A2DA37\",\"#A5DA35\",\"#A7DB33\",\"#AADB32\",\"#ADDC30\",\"#AFDC2E\",\"#B2DD2C\",\"#B5DD2B\",\"#B7DD29\",\"#BADE27\",\"#BDDE26\",\"#BFDF24\",\"#C2DF22\",\"#C5DF21\",\"#C7E01F\",\"#CAE01E\",\"#CDE01D\",\"#CFE11C\",\"#D2E11B\",\"#D4E11A\",\"#D7E219\",\"#DAE218\",\"#DCE218\",\"#DFE318\",\"#E1E318\",\"#E4E318\",\"#E7E419\",\"#E9E419\",\"#ECE41A\",\"#EEE51B\",\"#F1E51C\",\"#F3E51E\",\"#F6E61F\",\"#F8E621\",\"#FAE622\",\"#FDE724\"]},\"id\":\"21551\",\"type\":\"LogColorMapper\"},{\"attributes\":{\"toolbar\":{\"id\":\"21615\",\"type\":\"ProxyToolbar\"},\"toolbar_location\":\"above\"},\"id\":\"21616\",\"type\":\"ToolbarBox\"},{\"attributes\":{},\"id\":\"21609\",\"type\":\"YearsTicker\"},{\"attributes\":{\"children\":[[{\"id\":\"21552\",\"subtype\":\"Figure\",\"type\":\"Plot\"},0,0]]},\"id\":\"21614\",\"type\":\"GridBox\"},{\"attributes\":{\"callback\":null,\"data\":{\"dh\":[1643047200000.0,1643050800000.0,1643054400000.0,1643058000000.0,1643061600000.0],\"dw\":[1643047200000.0,1643050800000.0,1643054400000.0,1643058000000.0,1643061600000.0],\"im\":[1643047200000.0,1643050800000.0,1643054400000.0,1643058000000.0,1643061600000.0],\"x\":[1643047200000.0,1643050800000.0,1643054400000.0,1643058000000.0,1643061600000.0],\"y\":[1643047200000.0,1643050800000.0,1643054400000.0,1643058000000.0,1643061600000.0]},\"selected\":{\"id\":\"21612\",\"type\":\"Selection\"},\"selection_policy\":{\"id\":\"21611\",\"type\":\"UnionRenderers\"}},\"id\":\"21550\",\"type\":\"ColumnDataSource\"},{\"attributes\":{},\"id\":\"21572\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"mantissas\":[1,2,5],\"max_interval\":500.0,\"num_minor_ticks\":0},\"id\":\"21598\",\"type\":\"AdaptiveTicker\"},{\"attributes\":{\"days\":[1,15]},\"id\":\"21604\",\"type\":\"DaysTicker\"},{\"attributes\":{\"months\":[0,4,8]},\"id\":\"21607\",\"type\":\"MonthsTicker\"},{\"attributes\":{\"days\
":[1,4,7,10,13,16,19,22,25,28]},\"id\":\"21602\",\"type\":\"DaysTicker\"},{\"attributes\":{\"formatter\":{\"id\":\"21596\",\"type\":\"BasicTickFormatter\"},\"ticker\":{\"id\":\"21567\",\"type\":\"BasicTicker\"}},\"id\":\"21566\",\"type\":\"LinearAxis\"},{\"attributes\":{\"color_mapper\":{\"id\":\"21551\",\"type\":\"LogColorMapper\"},\"dh\":{\"units\":\"data\",\"value\":1},\"dw\":{\"units\":\"data\",\"value\":1},\"image\":{\"field\":\"im\"},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"21585\",\"type\":\"Image\"},{\"attributes\":{\"months\":[0,1,2,3,4,5,6,7,8,9,10,11]},\"id\":\"21605\",\"type\":\"MonthsTicker\"},{\"attributes\":{\"text\":\"\"},\"id\":\"21591\",\"type\":\"Title\"},{\"attributes\":{\"dimension\":1,\"ticker\":{\"id\":\"21567\",\"type\":\"BasicTicker\"}},\"id\":\"21570\",\"type\":\"Grid\"},{\"attributes\":{\"days\":[1,8,15,22]},\"id\":\"21603\",\"type\":\"DaysTicker\"},{\"attributes\":{\"num_minor_ticks\":5,\"tickers\":[{\"id\":\"21598\",\"type\":\"AdaptiveTicker\"},{\"id\":\"21599\",\"type\":\"AdaptiveTicker\"},{\"id\":\"21600\",\"type\":\"AdaptiveTicker\"},{\"id\":\"21601\",\"type\":\"DaysTicker\"},{\"id\":\"21602\",\"type\":\"DaysTicker\"},{\"id\":\"21603\",\"type\":\"DaysTicker\"},{\"id\":\"21604\",\"type\":\"DaysTicker\"},{\"id\":\"21605\",\"type\":\"MonthsTicker\"},{\"id\":\"21606\",\"type\":\"MonthsTicker\"},{\"id\":\"21607\",\"type\":\"MonthsTicker\"},{\"id\":\"21608\",\"type\":\"MonthsTicker\"},{\"id\":\"21609\",\"type\":\"YearsTicker\"}]},\"id\":\"21562\",\"type\":\"DatetimeTicker\"},{\"attributes\":{},\"id\":\"21612\",\"type\":\"Selection\"},{\"attributes\":{},\"id\":\"21557\",\"type\":\"LinearScale\"},{\"attributes\":{},\"id\":\"21567\",\"type\":\"BasicTicker\"},{\"attributes\":{},\"id\":\"21559\",\"type\":\"LinearScale\"},{\"attributes\":{},\"id\":\"21571\",\"type\":\"PanTool\"},{\"attributes\":{},\"id\":\"21594\",\"type\":\"DatetimeTickFormatter\"},{\"attributes\":{\"below\":[{\"id\":\"21561\",\"type\":\"DatetimeAxis\"}],\"cente
r\":[{\"id\":\"21565\",\"type\":\"Grid\"},{\"id\":\"21570\",\"type\":\"Grid\"}],\"left\":[{\"id\":\"21566\",\"type\":\"LinearAxis\"}],\"plot_height\":400,\"plot_width\":1000,\"renderers\":[{\"id\":\"21587\",\"type\":\"GlyphRenderer\"}],\"right\":[{\"id\":\"21589\",\"type\":\"ColorBar\"}],\"title\":{\"id\":\"21591\",\"type\":\"Title\"},\"toolbar\":{\"id\":\"21577\",\"type\":\"Toolbar\"},\"toolbar_location\":null,\"x_range\":{\"id\":\"21553\",\"type\":\"DataRange1d\"},\"x_scale\":{\"id\":\"21557\",\"type\":\"LinearScale\"},\"y_range\":{\"id\":\"21555\",\"type\":\"DataRange1d\"},\"y_scale\":{\"id\":\"21559\",\"type\":\"LinearScale\"}},\"id\":\"21552\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{},\"id\":\"21575\",\"type\":\"ResetTool\"},{\"attributes\":{\"callback\":null},\"id\":\"21555\",\"type\":\"DataRange1d\"},{\"attributes\":{},\"id\":\"21576\",\"type\":\"HelpTool\"},{\"attributes\":{\"months\":[0,6]},\"id\":\"21608\",\"type\":\"MonthsTicker\"},{\"attributes\":{\"days\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]},\"id\":\"21601\",\"type\":\"DaysTicker\"},{\"attributes\":{},\"id\":\"21592\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"base\":24,\"mantissas\":[1,2,4,6,8,12],\"max_interval\":43200000.0,\"min_interval\":3600000.0,\"num_minor_ticks\":0},\"id\":\"21600\",\"type\":\"AdaptiveTicker\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_inspect\":\"auto\",\"active_multi\":null,\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"21571\",\"type\":\"PanTool\"},{\"id\":\"21572\",\"type\":\"WheelZoomTool\"},{\"id\":\"21573\",\"type\":\"BoxZoomTool\"},{\"id\":\"21574\",\"type\":\"SaveTool\"},{\"id\":\"21575\",\"type\":\"ResetTool\"},{\"id\":\"21576\",\"type\":\"HelpTool\"}]},\"id\":\"21577\",\"type\":\"Toolbar\"},{\"attributes\":{\"base\":60,\"mantissas\":[1,2,5,10,15,20,30],\"max_interval\":1800000.0,\"min_interval\":1000.0,\"num_minor_ticks\":0},\"id\":\"21599\",\"type\":\"Adap
tiveTicker\"},{\"attributes\":{\"source\":{\"id\":\"21550\",\"type\":\"ColumnDataSource\"}},\"id\":\"21588\",\"type\":\"CDSView\"},{\"attributes\":{\"data_source\":{\"id\":\"21550\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"21585\",\"type\":\"Image\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"21586\",\"type\":\"Image\"},\"selection_glyph\":null,\"view\":{\"id\":\"21588\",\"type\":\"CDSView\"}},\"id\":\"21587\",\"type\":\"GlyphRenderer\"},{\"attributes\":{},\"id\":\"21593\",\"type\":\"BasicTicker\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"21610\",\"type\":\"BoxAnnotation\"},{\"attributes\":{},\"id\":\"21611\",\"type\":\"UnionRenderers\"},{\"attributes\":{\"color_mapper\":{\"id\":\"21551\",\"type\":\"LogColorMapper\"},\"formatter\":{\"id\":\"21592\",\"type\":\"BasicTickFormatter\"},\"label_standoff\":12,\"ticker\":{\"id\":\"21593\",\"type\":\"BasicTicker\"}},\"id\":\"21589\",\"type\":\"ColorBar\"},{\"attributes\":{\"callback\":null},\"id\":\"21553\",\"type\":\"DataRange1d\"},{\"attributes\":{},\"id\":\"21574\",\"type\":\"SaveTool\"},{\"attributes\":{},\"id\":\"21596\",\"type\":\"BasicTickFormatter\"}],\"root_ids\":[\"21617\"]},\"title\":\"Bokeh Application\",\"version\":\"1.3.4\"}};\n var render_items = [{\"docid\":\"9effa7f3-929a-45ae-af79-1be68dfb3616\",\"roots\":{\"21617\":\"514cfeda-ca8f-450c-aa55-e38e27fd6f28\"}}];\n root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n\n }\n if (root.Bokeh !== undefined) {\n embed_document(root);\n } else {\n var attempts = 0;\n var timer = setInterval(function(root) {\n if (root.Bokeh !== undefined) {\n embed_document(root);\n clearInterval(timer);\n 
}\n attempts++;\n if (attempts > 100) {\n console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n clearInterval(timer);\n }\n }, 10, root)\n }\n})(window);</script>\n"

The result I get from the server side has a different structure, and I don’t see any 2D data placeholder for the image in it.

First of all, I have to apologize. The dw and dh arguments for image do not specify the width/height of each pixel, but the width/height of the ENTIRE image. I ALWAYS get that mixed up! Sorry about that.

That said, I still see a few potential issues:

  1. Ensure your dataframe is parsing “window_time” as a datetime series.
  2. Ensure you’re specifying timedelta/datetime values for the x and dw arguments.
  3. You aren’t using the pandas pivot method.
  4. You have not populated the image’s ColumnDataSource correctly.
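A quick way to check points 1 and 2 might look like the sketch below. The two timestamps are made-up sample values, and is_dt and step are just illustrative names:

```python
import pandas as pd

# Hypothetical column that arrived as strings rather than datetimes.
pdf = pd.DataFrame({"window_time": ["2022-01-24 18:00:00", "2022-01-24 19:00:00"]})

# Point 1: convert, then confirm the column really is a datetime series.
pdf["window_time"] = pd.to_datetime(pdf["window_time"])
is_dt = pd.api.types.is_datetime64_any_dtype(pdf["window_time"])

# Point 2: widths on a datetime axis are timedeltas, e.g. the spacing between samples.
step = pdf["window_time"].diff().min()

print(is_dt, step)
```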

I’ve made two examples here (dummy data), one using the rect glyph (left) and another using the image glyph (right), both producing the same plot (aside from some default formatting stuff). The point I was trying to make should be apparent → with the rect you don’t have to transform the dataframe into a 2D array. Take a close look/walk through this → hopefully it lays things out for you.

# -*- coding: utf-8 -*-
"""
Created on Sat Jan 29 18:45:36 2022

@author: harol
"""

import pandas as pd
import numpy as np
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, LogColorMapper, ColorBar
#will want to use transform if using the rect glyph
from bokeh.transform import log_cmap

#using transform cmap --> this will return a dictionary that bokeh uses to map a field name to a color mapper
# color_mapper = LogColorMapper(palette="Viridis256", low=1, high=10)
color_mapper = log_cmap('index','Viridis256',low=1,high=10)

#construct flat table similar to what you have
dfx = pd.DataFrame(data={'window_time':pd.date_range(start='Jan 1 2020',end='Jan 10 2020')})
dfy = pd.DataFrame(data={'delayWindowEnd':np.arange(0,len(dfx))})
df = pd.merge(dfx,dfy,how='cross')
df['index'] =np.random.random(len(df))*10

#rect glyph example --> make columndatasource straight from your flat table
src= ColumnDataSource(df)
plot = figure(toolbar_location=None,x_axis_type='datetime')
#add rect glyph/renderer, pointing the glyph to the fields in the source for the xy,
 # and scalar "hard coded" width set to one day and height of 1
 
plot.rect(x='window_time', y='delayWindowEnd' #coords for rect centers
          , width=pd.to_timedelta('1D'),height=1,fill_color=color_mapper,source=src)
#build the colorbar using the mapper
color_bar = ColorBar(color_mapper=color_mapper['transform'], label_standoff=12)
plot.add_layout(color_bar, 'right')

#image glyph example
implot = figure(toolbar_location=None,x_axis_type='datetime')
#example of how to pivot the dataframe
piv = df.pivot(index='delayWindowEnd',columns='window_time',values='index')
#make a different cds that'll use the pivotted dataframe
#shifting the 'x' half a day and the y 0.5 to mimic the rect center coords
imsrc = ColumnDataSource(data={'x':[pd.to_datetime('Jan 1 2020')-pd.to_timedelta('0.5D')] #left most
                               ,'y':[-0.5] #bottom most 
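                               ,'dw':[df['window_time'].max()-df['window_time'].min()+pd.to_timedelta('1D')] #TOTAL width of image: span between cell centers plus one cell
                               ,'dh':[df['delayWindowEnd'].max()-df['delayWindowEnd'].min()+1] #TOTAL height of image: span between cell centers plus one cell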
                               ,'im':[piv.to_numpy()] #2D array using to_numpy() method on pivotted df
                               })
#image glyph/renderer using imsrc
implot.image(x='x', y='y', source=imsrc, image='im',dw='dw',dh='dh',  color_mapper=color_mapper['transform'])
color_bar = ColorBar(color_mapper=color_mapper['transform'], label_standoff=12)
implot.add_layout(color_bar, 'right')

from bokeh.layouts import layout
show(layout([[plot,implot]]))
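If the image is meant to line up with the rect cells exactly, the extent arithmetic can be sketched on its own like this. It assumes evenly spaced cell centers; the names times, levels, cell_w and cell_h are illustrative and not part of any Bokeh API:

```python
import pandas as pd

# Cell centers, as in the dummy data above: 10 days on x, levels 0..9 on y.
times = pd.date_range("2020-01-01", "2020-01-10")
levels = list(range(10))

cell_w = pd.Timedelta("1D")  # assumed width of one cell
cell_h = 1                   # assumed height of one cell

# Lower-left corner sits half a cell before the first center...
x = times.min() - cell_w / 2
y = min(levels) - cell_h / 2

# ...and the total size is the span between centers plus one full cell.
dw = (times.max() - times.min()) + cell_w
dh = (max(levels) - min(levels)) + cell_h

print(x, y, dw, dh)
```

With these values the image covers the same area as ten by ten rects of one day by one unit centered on the data points.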

Thanks for a very detailed explanation! As a first step I am just trying to reproduce your code in Apache Zeppelin.

I’m getting the exception below on merge:


---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
<ipython-input-596-a9917aea4e22> in <module>
      8 
      9 for sensor in sensors:
---> 10     plot_summaries(sensor, "maxmin2")

<ipython-input-593-42e666054517> in plot_summaries(sensor, dfName)
     15     dfx = pd.DataFrame(data={'window_time':pd.date_range(start='Jan 24 2022',end='Jan 25 2022')})
     16     dfy = pd.DataFrame(data={'delayWindowEnd':np.arange(0,len(dfx))})
---> 17     df = pd.merge(dfx,dfy,how='cross')
     18     df['index'] =np.random.random(len(df))*10
     19 

/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     80         copy=copy,
     81         indicator=indicator,
---> 82         validate=validate,
     83     )
     84     return op.get_result()

/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    618             warnings.warn(msg, UserWarning)
    619 
--> 620         self._validate_specification()
    621 
    622         # note this function has side effects

/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/merge.py in _validate_specification(self)
   1189                             ron=self.right_on,
   1190                             lidx=self.left_index,
-> 1191                             ridx=self.right_index,
   1192                         )
   1193                     )

MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False

I just had a pandas version older than 1.2 for some reason, and how='cross' is only available from 1.2 on.

I have found a workaround:

dfx['key'] = 0
dfy['key'] = 0
df = dfx.merge(dfy, on='key', how='outer')
df = df.drop(columns=['key'])

and now have this example rendering fine
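For anyone else stuck on an older pandas, the same idea can be wrapped up as a small helper. This is only a sketch; cross_join and the _key column are hypothetical names, not pandas API:

```python
import numpy as np
import pandas as pd

def cross_join(left, right):
    """Cartesian product of two frames via a temporary constant key."""
    return (
        left.assign(_key=0)
        .merge(right.assign(_key=0), on="_key", how="outer")
        .drop(columns="_key")
    )

dfx = pd.DataFrame({"window_time": pd.date_range("2022-01-24", "2022-01-25")})
dfy = pd.DataFrame({"delayWindowEnd": np.arange(len(dfx))})

df = cross_join(dfx, dfy)
print(len(df))  # 4 rows: every window_time paired with every delayWindowEnd
```

Using assign keeps dfx and dfy themselves free of the helper column, and the result of drop is actually kept rather than discarded.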

I have succeeded with that, thank you so much!

    dft = sqlContext.table(dfName)
    pdf = dft.toPandas()
    import pandas as pd
    rowIDs = pdf['values']
    colIDs = pdf['window_time']

    A = pdf.pivot_table('index', 'values', 'window_time', fill_value=0)
    source = ColumnDataSource(data={'x':[pd.to_datetime('Jan 24 2022')] #left most
                               ,'y':[0] #bottom most 
                               ,'dw':[pdf['window_time'].max()-pdf['window_time'].min()] #TOTAL width of image
                               #,'dh':[df['delayWindowEnd'].max()] #TOTAL height of image
                               ,'dh':[1000] #TOTAL height of image
                               ,'im':[A.to_numpy()] #2D array using to_numpy() method on pivotted df
                               })

    color_mapper = LogColorMapper(palette="Viridis256", low=1, high=20)

    plot = figure(toolbar_location=None,x_axis_type='datetime')
    plot.image(x='x', y='y', source=source, image='im',dw='dw',dh='dh',  color_mapper=color_mapper)

    color_bar = ColorBar(color_mapper=color_mapper, label_standoff=12)

    plot.add_layout(color_bar, 'right')
    show(gridplot([plot], ncols=1, plot_width=1000, plot_height=400))

Heh, I was so happy when pandas finally implemented a cross join - I got really really tired of making those dummy columns :)
