I’m new to Bokeh. Is it possible to do aggregation on TimeSeries charts? Say I have a data set of (timestamp, expense amount) tuples. How can I plot mean expenses over time? I could resample the data set beforehand into (date range, mean expense amount) pairs and plot that. Can Bokeh do this for me?
It's easier to aggregate with pandas and plot the result.
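A minimal sketch of that suggestion, assuming the data arrives as (timestamp, amount) tuples. The function names `mean_expenses` and `plot_mean_expenses`, the column names, and the daily bucket width are illustrative choices, not anything prescribed in this thread:

```python
import pandas as pd


def mean_expenses(records, rule="D"):
    """Return the mean expense amount per time bucket.

    records: iterable of (timestamp, amount) tuples.
    rule: pandas offset alias for the bucket width ("D" is one calendar day).
    """
    df = pd.DataFrame(records, columns=["timestamp", "amount"])
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    # resample() needs a DatetimeIndex; buckets without any rows become NaN.
    return df.set_index("timestamp")["amount"].resample(rule).mean()


def plot_mean_expenses(series):
    """Draw the aggregated series as a line on a datetime axis."""
    # Imported lazily so the aggregation can be used without Bokeh installed.
    from bokeh.plotting import figure, show

    p = figure(x_axis_type="datetime", title="Mean expenses")
    p.line(series.index, series.values)
    show(p)
```

Calling `plot_mean_expenses(mean_expenses(data))` would then plot one point per day; passing a different `rule` changes the bucket width.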
On Tuesday, May 31, 2016 at 1:47:39 PM UTC-4, Jonas Haag wrote:
Hi,
I’m new to Bokeh. Is it possible to do aggregation on TimeSeries charts? Say I have a data set of (timestamp, expense amount) tuples. How can I plot mean expenses over time? I could resample the data set beforehand into (date range, mean expense amount) pairs and plot that. Can Bokeh do this for me?
Thanks
Jonas
Understood. I’m just wondering whether this actually belongs in the plotting component. What if you want to zoom in further than the aggregation interval allows? If I understand things correctly, I have to decide beforehand which X axis ranges are likely to be viewed. (If my aggregation interval is too small, the plot will be difficult to read; if it is too large, some details will be missing once you zoom in far enough.)
Jonas
On 01 Jun 2016, at 17:23, Aaron Goldenberg <[email protected]> wrote:
We are telling you to separate the tasks of data munging and plotting.
Bokeh is actually two libraries: the Python "bokeh" module and a JavaScript library, BokehJS. When you make a plot with Bokeh (Python), you are actually constructing a declarative JSON representation of your visualization and shipping it to BokehJS to be interpreted and displayed. This means that normally you have to ship all the data ahead of time. We can't do things like scale-dependent resampling and aggregation automatically in the browser using pandas, because your browser can't run Python code. To accomplish that kind of functionality, there are basically two options:
* Do it in JavaScript
* Use a Bokeh Server to respond to range updates with real Python code
The downside to "doing it in JS" is that BokehJS is already a large library by JS standards. We can't continually bloat core BokehJS by adding every possible imaginable feature, or it will become unusable. On the other hand, it is now possible to extend Bokeh and BokehJS with custom user models, so something like this might make a prime candidate for that. In fact I believe some folks have already made custom models to support scale-dependent downsampling.
Certainly a Bokeh app is also a reasonable approach, but that's something a user would write, because the user knows what kind of downsampling or refinement is appropriate to the actual data and use case.
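As a rough sketch of what such a user-written app could do, the code below picks a bucket width from the visible time span and re-aggregates with pandas whenever the x range changes. The zoom thresholds, the helper names, and the "timestamp"/"amount" column names are all hypothetical, and the series is assumed to have a sorted DatetimeIndex; only `attach_to_plot` touches Bokeh, and it is meant to run inside a Bokeh server app:

```python
import pandas as pd

# Hypothetical zoom levels, as (largest visible span, bucket width) pairs.
# The thresholds are illustrative only; tune them to the data.
ZOOM_LEVELS = [
    (pd.Timedelta(days=2), pd.Timedelta(minutes=15)),
    (pd.Timedelta(days=60), pd.Timedelta(hours=6)),
]
COARSEST_BUCKET = pd.Timedelta(days=7)


def bucket_for_span(span):
    """Return the bucket width to use for a visible time span."""
    for max_span, bucket in ZOOM_LEVELS:
        if span <= max_span:
            return bucket
    return COARSEST_BUCKET


def aggregate_visible(series, start, end):
    """Mean per bucket over the visible [start, end] window (empty buckets dropped)."""
    visible = series.loc[start:end]
    return visible.resample(bucket_for_span(end - start)).mean().dropna()


def attach_to_plot(series, plot, source):
    """Re-aggregate into `source` whenever the x range of `plot` changes."""

    def on_range_change(attr, old, new):
        start, end = plot.x_range.start, plot.x_range.end
        if start is None or end is None:
            return  # range not initialised yet
        # Assumption: a datetime axis reports its bounds as ms since the epoch.
        agg = aggregate_visible(
            series,
            pd.to_datetime(start, unit="ms"),
            pd.to_datetime(end, unit="ms"),
        )
        source.data = {"timestamp": agg.index.values, "amount": agg.values}

    plot.x_range.on_change("start", on_range_change)
    plot.x_range.on_change("end", on_range_change)
```

Here `plot` would be a figure created with `x_axis_type="datetime"` and `source` the `ColumnDataSource` behind its line glyph. Zooming in then swaps coarse weekly means for finer buckets without deciding on a single interval ahead of time.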
If your data is very large, and you want things like automatic progressive refinement and visual queries across large scales, you might want to look into the related (OSS) Datashader project, which integrates closely with Bokeh.
Thanks for the detailed answer. I’ve had a look at Datashader and I gotta say it looks very promising! Thanks for the pointer!
On 01 Jun 2016, at 17:51, Bryan Van de Ven <[email protected]> wrote:
The downside to "doing it in JS" is that BokehJS is already a large library by JS standards. We can't continually bloat core BokehJS by adding every possible imaginable feature, or it will become unusable.
Fair enough. To be clear, the problem I’m having with lots of samples isn’t performance; it’s that I can’t make sense of the data because there are just too many data points, and time series charts don’t support client-side aggregation/binning. For binning, of course, a histogram would suit best, but Bokeh’s charts.Histogram doesn’t seem to support datetime values on the X axis. (I’m also not quite sure it supports client-side binning…)