Let’s assume a pellet-fired meat smoker (a cooking device) with a temperature setting and a temperature probe. The temperature is maintained automatically by an embedded controller with limited memory and an uptime clock. Uptime (duration of use) will vary and is not known in advance.
Let’s say we want to store temperature probe samples so that at any time during use, a line chart of the temperature samples can be generated.
Is there a clever/elegant algorithm and data structure for maintaining and culling probe readings so that they stay approximately evenly spaced in time over any duration of cook? Assuming the uptime clock value requires one unit of memory, each sample takes one unit of memory, and there are only M units of memory for this purpose, how can we store the most samples with optimal spacing over time, culling and replacing the most redundant samples as the uptime grows? Can it be done without moving samples and without explicitly storing a timestamp with each sample?
Maybe related to “Reservoir Sampling” but not sure that’s a match here because we need to reconstruct when the sample was captured too, not just that the value is in the reservoir somewhere.
A poor solution, I think: with 3 units of memory for samples, take a reading at every power-of-two time and store the readings in a ring buffer. At any moment we'd retain a reading from roughly 25% of uptime, one from 50%, and the most recent reading. After waiting double that time again, the previous 25% reading (the oldest) is replaced; the previous 50% reading is now relatively twice as old and becomes the new quarter-time sample, and the previously most recent reading becomes the mid-time sample. But this biases our good time resolution toward the past, leaving hardly any data about recent events.
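A quick sketch of that scheme (illustrative only; `read_probe` stands in for the real probe read):

```python
# Power-of-two ring-buffer idea with M = 3 sample slots.
# Readings are taken at times 2, 4, 8, 16, ...; at any moment the buffer
# holds readings from roughly 25%, 50%, and 100% of current uptime.
M = 3

def run(read_probe, max_uptime):
    ring = [None] * M
    n = 0                            # readings taken so far
    t = 1
    while t * 2 <= max_uptime:
        t *= 2                       # next power-of-two sample time
        ring[n % M] = read_probe(t)  # overwrite the oldest slot
        n += 1
    return ring
```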
Let's start with an easy, somewhat sub-optimal solution:
Solution 1: Whenever your buffer fills up, discard every 2nd sample and divide your sampling rate by 2.
This has some nice properties -- you always have a consistent sampling rate across the whole uptime period and your samples are always regularly spaced.
Of course the problem with this solution is that sometimes you are leaving up to half your buffer unused, and if it was all used, you could have almost twice as many samples as you do.
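A minimal sketch of Solution 1, assuming a plain list buffer (the names here are mine, not from any library):

```python
def compress_halve(buf):
    """Solution 1: discard every 2nd sample (keep indexes 0, 2, 4, ...)."""
    return buf[::2]

# In the sampling loop, once the buffer reaches its capacity M:
#   buf = compress_halve(buf)
#   interval *= 2   # i.e. sampling rate divided by 2
```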
It seems to get better if we just used smaller steps:
Solution 2: Whenever your buffer fills up, discard every 10th sample and multiply your sampling rate by 0.9.
Now you are always using at least 90% of your buffer, which is great, but the spacing of the samples you keep is non-uniform and will end up with some annoying visible patterns in it.
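The same skeleton with the smaller step, again just a sketch:

```python
def compress_drop_tenth(buf):
    """Solution 2: discard every 10th sample (indexes 9, 19, 29, ...)."""
    return [s for i, s in enumerate(buf) if i % 10 != 9]

# Once the buffer fills:
#   buf = compress_drop_tenth(buf)
#   interval *= 10 / 9   # i.e. sampling rate multiplied by 0.9
```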
In addition to always making good use of the buffer, we would like the spacing of our samples to be nearly uniform at all times.
This requires a little magic:
Solution 3: Blue noise mask
To fix the remaining problem, we can use a trick called a "blue noise mask". There's a lot you can google about them, but since the technique was invented for digital halftoning (dithering), almost all of it is for 2D applications. It does work just as well for 1D, though. It's also conceptually very simple. Adapted for your specific problem, it works as follows.
Let's say you want each buffer compression to reduce the number of samples by about 10%. We'll divide the count by 7^(1/19) ≈ 1.108 each time, so that after 19 compressions we have divided the sample count by exactly 7. I think using primes (7 and 19) will help make the result look more natural.
The blue noise mask we want is just an array of numbers, where each index corresponds to a sample time, such that:

- Each entry is a "lifetime" from 0 to 19, giving the number of compressions that a sample taken at that time will survive;
- For every n, the proportion of entries with lifetimes >= n is about 7^(-n/19) (so each compression discards about 10% of what remains); and
- Selecting all the indexes with lifetimes >= x, for any 0 <= x <= 19, always produces an approximately uniform distribution.
Given such a mask, we have a really good solution to your problem:
When your buffer fills up, apply a compression using the blue noise mask. Starting with n = 0, for the nth compression, iterate through the mask and:

- discard the samples at indexes with entries equal to (n%19); and
- keep the samples at indexes with entries greater than (n%19).
Note that the samples with entries < (n%19) are the ones that were discarded in previous compressions.
Iterate through the mask repeatedly, tiling it end to end, until you've accounted for every sample in your buffer.
The mask should also be used to control your sampling rate. Always sample at the times that would have been preserved by the last compression.
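In code, the compression step and the rate control might look like the sketch below. This reflects my reading of the scheme; tracking each live sample's index in mask coordinates, and the names `compress` and `should_sample`, are assumptions of the sketch:

```python
LEVELS = 19   # compressions per cycle; after 19, the count has dropped by 7x

def compress(samples, mask, n):
    """Apply the nth compression (n = 0 .. LEVELS-1 within a cycle).

    `samples` is a time-ordered list of (mask_index, value) pairs and the
    mask is tiled across the buffer. Entries equal to n%19 are discarded,
    entries greater are kept; entries less than n%19 are already gone.
    """
    level = n % LEVELS
    return [(i, v) for i, v in samples if mask[i % len(mask)] > level]

def should_sample(mask_index, mask, n):
    """Rate control: after the nth compression, only take new samples at
    times that compression would have preserved (use n = -1 before any)."""
    level = n % LEVELS if n >= 0 else -1
    return mask[mask_index % len(mask)] > level

# After compression 18, only lifetime-19 entries survive (every 7th index);
# divide the survivors' mask indexes by 7 and restart the cycle at n = 0
# on the 7x coarser timescale.
```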
Making a blue noise mask
This is a really good solution to your problem, but of the requirements listed above for the blue noise mask, one is hard to meet: "Selecting all the indexes with lifetimes >= x, for any 0 <= x <= 19, always produces an approximately uniform distribution".
To accomplish that feat, start by assigning lifetime 19 to every 7th entry, starting at 0. Then use 1D Poisson disk sampling with appropriately decreasing disk size to gradually fill in the entries for shorter lifetimes.
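A rough construction sketch, using simple dart throwing for the Poisson disk step. The function `make_mask` and its parameters are my invention, and a real mask generator would deserve more care:

```python
import random

def make_mask(N, levels=19, seed=0):
    """Build a 1D blue-noise lifetime mask of length N (a multiple of 7).

    Every 7th entry gets lifetime `levels`; for each x, the set
    {i : mask[i] >= x} holds roughly N * 7**(-x/19) well-spaced indexes.
    """
    assert N % 7 == 0
    rng = random.Random(seed)
    mask = [None] * N
    chosen = set(range(0, N, 7))                 # lifetime-19 anchors
    for i in chosen:
        mask[i] = levels
    for level in range(levels - 1, -1, -1):
        target = round(N * 7 ** (-level / 19))   # |{i : mask[i] >= level}|
        radius = N / (2.0 * target)              # heuristic disk radius
        free = [i for i in range(N) if mask[i] is None]
        rng.shuffle(free)
        while len(chosen) < target and free:
            placed = False
            for k, i in enumerate(free):
                # circular distance to the nearest already-chosen index
                d = min(min(abs(i - j), N - abs(i - j)) for j in chosen)
                if d >= radius:
                    mask[i] = level
                    chosen.add(i)
                    free.pop(k)
                    placed = True
                    break
            if not placed:
                radius *= 0.5                    # stuck: relax and retry
    for i in range(N):
        if mask[i] is None:                      # anything left dies first
            mask[i] = 0
    return mask
```

The dart throwing here is brute force (roughly quadratic), but the mask only needs to be generated once, offline, and could be baked into the firmware.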