Long delay updating image with remote connection

Hi, I'm running a Bokeh server application that updates an image interactively. It loads reasonably quickly over a local connection, but over a remote connection there is a delay of 20-50 seconds, and I'm trying to understand what is going on. The array being plotted is a 16-bit 1408x1152 grayscale image mapped to Inferno256. Saved as an image file it should be at most 5 MB of data, and given that I can transfer files from the same server at 3 MB/s, I expect only a few seconds of delay from the remote connection. The figure itself is 1100x900, and using the Bokeh save tool results in a 1 MB PNG.

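figure initialization

import numpy as np
from bokeh.models import BasicTicker, ColorBar, ColumnDataSource, LinearColorMapper
from bokeh.palettes import Inferno256
from bokeh.plotting import figure

p = figure(width=1100, height=900, x_range=(0.5, 11.5), y_range=(0.5, 9.5))
mapper = LinearColorMapper(palette=Inferno256, low=0)
# the image column holds a list of 2D arrays, matching the update below
source = ColumnDataSource({'image': [np.arange(1408*1152).reshape((1152, 1408))]})
r = p.image(image='image', source=source, x=0.5, y=0.5, dw=11, dh=9, color_mapper=mapper)
color_bar = ColorBar(color_mapper=mapper, location=(0, 0), ticker=BasicTicker())
p.add_layout(color_bar, 'right')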

update

import pandas as pd
import skimage.io
import skimage.measure

# stitching 1024 x 1024 images together, then downsampling by a factor of 8
stitch = np.zeros((1152*8, 1408*8))
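for indx, path in s.iteritems():
    img = skimage.io.imread(path)
    j = indx % 9   # tile row in the 9 x 11 grid
    i = indx // 9  # tile column
    # scale the grid position by the 1024-pixel tile size
    stitch[j*1024:(j+1)*1024, i*1024:(i+1)*1024] = img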

# downsampling
stitch = skimage.measure.block_reduce(stitch, block_size=(8, 8), func=np.mean)

r.data_source.data = ColumnDataSource({'image': [stitch[::-1]]}).data
p.title.text = '%s' % pd.Timestamp('now').strftime('%Y%m%d %H:%M:%S')

print 'done updating plot', pd.Timestamp('now')

This takes about 5 seconds on my server: ~2 s to load the images and downsample, and ~3 s to update the data. Over a local connection on the server the figure updates almost immediately. Over a remote connection the update can arrive as much as 50 seconds later.
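
For reference, the stitch-and-downsample step can be reproduced without the image files. This is only a minimal sketch using NumPy: the tile size is shrunk to 16 pixels, the tiles are synthetic, and a reshape-based block mean (the hypothetical helper `block_mean`) stands in for `skimage.measure.block_reduce`. All three are assumptions made for illustration.

```python
import numpy as np

TILE = 16            # assumed small tile for the sketch; the real tiles are 1024 x 1024
ROWS, COLS = 9, 11   # 9 x 11 tile grid, as implied by indx % 9 and indx // 9
FACTOR = 8           # downsampling factor

stitch = np.zeros((ROWS * TILE, COLS * TILE))
for indx in range(ROWS * COLS):
    img = np.full((TILE, TILE), float(indx))  # synthetic tile filled with its index
    j = indx % ROWS    # tile row
    i = indx // ROWS   # tile column
    # place the tile at its grid position, scaled by the tile size
    stitch[j * TILE:(j + 1) * TILE, i * TILE:(i + 1) * TILE] = img


def block_mean(a, f):
    """Average non-overlapping f x f blocks (shape must be divisible by f)."""
    h, w = a.shape
    return a.reshape(h // f, f, w // f, f).mean(axis=(1, 3))


small = block_mean(stitch, FACTOR)
print(small.shape)
```

With 1024-pixel tiles the same grid gives a 9216 x 11264 mosaic and a 1152 x 1408 downsampled image, which matches the array size described above.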

Hi,

Bokeh's serialization strategy is basically "JSON all the things". This tends to actually work reasonably well in a lot of cases, but images are probably one where it suffers. Your packed 16-bit array is converted into a "2d" list of lists of JSON doubles. It's going to be quite a bit larger than your source image array, and there is the overhead from the JSON conversion itself on both ends. There is an open issue to add support for a packed binary protocol for arrays, which should help in many cases, although perhaps not as much in yours: the binary data has to be in a form that can be immediately put into a JS typed array, and there are no 16 bit array types AFAIK.

This is speculation; there might be some other root cause. Do you have a complete runnable toy test case that could be used to investigate (at some point in the future)?
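
To get a rough feel for the size difference described above, here is a small sketch that compares a packed 16-bit payload with the same array written out as a JSON list of lists. It uses the standard-library `json` module as a stand-in, not Bokeh's actual serializer, and a synthetic array in place of the real image, so the numbers are only indicative.

```python
import json

import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the downsampled image (an assumption, not the real data):
# the mean of an 8 x 8 block of 16-bit pixels is a multiple of 1/64.
image = rng.integers(0, 2**16 * 64, size=(1152, 1408)) / 64.0

packed_bytes = image.size * 2                  # 2 bytes per pixel if sent as packed 16-bit
json_bytes = len(json.dumps(image.tolist()))   # list of lists of JSON numbers, as ASCII text

link_rate = 3e6  # bytes per second, from the 3 MB/s figure in the original post

print("packed 16-bit: %.1f MB, ~%.1f s" % (packed_bytes / 1e6, packed_bytes / link_rate))
print("JSON text:     %.1f MB, ~%.1f s" % (json_bytes / 1e6, json_bytes / link_rate))
```

The transfer times are a lower bound only, since they ignore the encoding and decoding work on both ends.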

Thanks,

Bryan

On Sep 23, 2016, at 3:17 AM, John Liu <[email protected]> wrote:

--
You received this message because you are subscribed to the Google Groups "Bokeh Discussion - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/bokeh/cf1c467a-15f1-436f-b737-0fe8067ca755%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.

Thanks for your reply, Bryan. The delay is down to about 10 s now and coincides with the falloff of network traffic. I should have watched my network traffic earlier, but I think the delay is now reasonable given my network speed and what you're saying. I'll continue to monitor. Would test code still be of interest given this information?

I'm interested in using datashader with Bokeh for pan and zoom features. It sounds like the bandwidth requirements would be the same, so I would need either a faster network or reduced resolution to serve remote users with less delay. Is this right, or would converting my array to something else allow JSON to use a smaller datatype for the image? Also, given adequate figure width and height, is there a way for the figure to display the image at its native resolution, without additional upsampling computation to fill the figure?
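
One way to test the "smaller datatype" idea, as a rough sketch only: since the palette has 256 colors, the array could be binned to integers 0..255 before it is put in the data source, so that each value takes at most three characters as JSON text. The helper `quantize_to_uint8` is hypothetical, the image is synthetic, and the standard-library `json` module stands in for whatever Bokeh does internally.

```python
import json

import numpy as np


def quantize_to_uint8(image, low, high):
    """Hypothetical helper: map values in [low, high] to integer bins 0..255."""
    scaled = (np.asarray(image, dtype=np.float64) - low) / float(high - low)
    return np.clip(np.floor(scaled * 256), 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)

# Synthetic stand-in for a (smaller) block-averaged 16-bit image
image = rng.integers(0, 2**16 * 64, size=(144, 176)) / 64.0

quantized = quantize_to_uint8(image, low=0.0, high=image.max())

full_len = len(json.dumps(image.tolist()))        # floats written out as text
small_len = len(json.dumps(quantized.tolist()))   # integers 0..255 written out as text
print(full_len, small_len)
```

This does not change the number of pixels sent, only the length of each number as text, and the original values are lost on the client side. I assume the color mapper would then need `low=0` and `high=255` to give a similar picture.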

Thanks,
John

On Friday, September 23, 2016 at 7:18:33 AM UTC-7, Bryan Van de ven wrote:
