I'm building line charts with Flot Charts to display time series.
To reduce the number of points to display, I downsample by averaging all the data points that fall within the same hour.
Recently, however, I discovered the Largest-Triangle-Three-Buckets (LTTB) algorithm: http://flot.base.is/
What are the differences between using such an algorithm and using a simple function like an average (per minute, per hour, per day, ...)?
To speed up queries over long periods, does it make sense to precompute an SQL table on the server side by applying LTTB to each month of data, and let the client side apply another LTTB pass to the aggregated data?
1: The problem with averages, for my purposes, is that they flatten large differences between samples. My peaks and valleys were more important than what was happening between them. The point of the three-buckets algorithm is to try to preserve those extremes (peaks/valleys) while not worrying about showing you all the times the data was similar or the same.
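To make the flattening concrete, here is a minimal sketch in Python. The function name `average_per_bucket` and the sample data are made up for illustration; it just groups (timestamp, value) pairs by hour and averages them, which is my reading of what the question describes.

```python
from collections import defaultdict

def average_per_bucket(samples, bucket_seconds=3600):
    """Average (timestamp_seconds, value) samples that fall in the same bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Integer division assigns each sample to its bucket (hour by default).
        buckets[ts // bucket_seconds].append(value)
    return [(key * bucket_seconds, sum(vals) / len(vals))
            for key, vals in sorted(buckets.items())]

# Hypothetical data: one reading per minute for an hour,
# flat at 20.0 with a single spike to 30.0.
samples = [(minute * 60, 20.0) for minute in range(60)]
samples[30] = (30 * 60, 30.0)

hourly = average_per_bucket(samples)
print(hourly)  # a single point whose value is about 20.17
```

The spike to 30.0 is reduced to an average of roughly 20.17, so it is effectively invisible on the chart.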
So, in my case, where the data was generally all the same (or close enough; it was temperature data) until sample X, at which point a small % change was important to show in the graph, the buckets algorithm was perfect.
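For reference, this is roughly how I understand the algorithm, as a sketch in Python rather than the Flot plugin's actual code. The name `lttb` and the `threshold` parameter (the number of points to keep) are my own choices. The first and last points are always kept; for each bucket in between, the point that forms the largest triangle with the previously kept point and the average of the next bucket is selected.

```python
def lttb(points, threshold):
    """Downsample a list of (x, y) tuples to `threshold` points."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)

    # Bucket width for the n - 2 interior points; first and last are always kept.
    every = (n - 2) / (threshold - 2)
    sampled = [points[0]]
    a = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # Current bucket: the candidates for selection.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1
        # Next bucket: its average is the third corner of the triangle.
        nxt = points[end:min(int((i + 2) * every) + 1, n)]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        ax, ay = points[a]
        best_area, best_idx = -1.0, start
        for j in range(start, end):
            px, py = points[j]
            # Area of the triangle (previous point, candidate, next average).
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) / 2
            if area > best_area:
                best_area, best_idx = area, j
        sampled.append(points[best_idx])
        a = best_idx

    sampled.append(points[-1])
    return sampled

# Hypothetical temperature-like data: flat at 20.0 with one spike at x = 37.
data = [(x, 20.0) for x in range(100)]
data[37] = (37, 25.0)
reduced = lttb(data, 10)
```

With this input the spike at x = 37 is among the 10 points kept, whereas averaging over the same range would have smoothed it away.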
Also, since the buckets algorithm is parameterized (you choose how many points to keep), you can vary that value, see which setting removes the most data while the graph still looks nearly identical, and decide how much you can dispense with before too much has been removed.
The naive approach would be decimation (removing X out of N samples), but what happens if it's the outliers you care about and the algorithm removes one? You could then change your decimation so that it keeps a sample whenever the difference is too great. LTTB is kind of a more sophisticated version of that concept.
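As a rough illustration of that "decimation with a guard" idea, here is a sketch in Python. The function name `decimate_keep_jumps` and the parameters `step` and `max_jump` are hypothetical; this is not LTTB, just the simpler concept it improves on.

```python
def decimate_keep_jumps(values, step, max_jump):
    """Keep every `step`-th value, plus any value that differs from the
    last kept value by more than `max_jump`. Returns (index, value) pairs."""
    if not values:
        return []
    kept = [(0, values[0])]
    for i in range(1, len(values)):
        last_value = kept[-1][1]
        if i % step == 0 or abs(values[i] - last_value) > max_jump:
            kept.append((i, values[i]))
    return kept

# Hypothetical data: flat at 20.0 with an outlier at index 7.
values = [20.0] * 20
values[7] = 26.0

plain = values[::5]                              # plain decimation misses index 7
guarded = decimate_keep_jumps(values, 5, 3.0)    # the guard keeps it
```

Plain decimation keeps only indices 0, 5, 10 and 15, so the outlier disappears; the guarded version also keeps index 7 (and index 8, where the value drops back).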
2: That depends on how quickly you can compute it all, whether the data ever changes, and various other factors. That's up to you. From my perspective, once my data was in the past and a sample had been 'chosen' to represent the bucket's value, it wouldn't change, so I could save it and never recalculate it.
Since your question is a bit old, what did you end up doing?