A running process is being benchmarked, and several statistics of the duration (minimum, maximum, and average) are calculated "cumulatively", that is, from the previous values plus the new element.
The standard deviation (σ) of the duration should be calculated as well, because it indicates the statistical dispersion.
In this application it is impractical to store every element, so the standard deviation must also be calculated from the previous values and the new element alone.
Example:
NewElement  Min  Max  Avg  StdDev           // AllElements (which are NOT meant to be stored)
1           1    1    1    -                [1]
2           1    2    1.5  0.5              [1,2]
3           1    3    2    0.8164965809277  [1,2,3]
4           1    4    2.5  1.1180339887499  [1,2,3,4]
0           0    4    2    1.4142135623731  [1,2,3,4,0]
(And this is an online calculator of SD, for reference values)
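The reference values in the table can also be reproduced with a naive calculation that keeps every element. This is only a sketch for checking results, since storing all elements is exactly what the application needs to avoid; `populationStats` and `all` are hypothetical names, and the population (divide by N) convention is assumed because it matches the table.

```javascript
// Reference only: stores every element, which the real application must avoid.
const populationStats = (elements) => {
  const count = elements.length;
  const avg = elements.reduce((sum, x) => sum + x, 0) / count;
  // Population variance: mean of the squared deviations from the average.
  const variance =
    elements.reduce((sum, x) => sum + (x - avg) ** 2, 0) / count;
  return {
    count,
    min: Math.min(...elements),
    max: Math.max(...elements),
    avg,
    sd: Math.sqrt(variance),
  };
};

// Print one row per prefix of the example data: [1], [1,2], [1,2,3], ...
const all = [1, 2, 3, 4, 0];
for (let n = 1; n <= all.length; n++) {
  console.log(populationStats(all.slice(0, n)));
}
```

For a single element this sketch reports an SD of 0, where the table shows "-".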
A simplified version of the goal would be:
const calculateNewStats = (stats, newElement) => {
  const newStats = {};
  newStats.count = stats.count + 1;
  newStats.min = Math.min(stats.min, newElement);
  newStats.max = Math.max(stats.max, newElement);
  newStats.avg = (stats.avg * stats.count + newElement) / newStats.count;
  // newStats.sd = ??? that's the problem
  return newStats;
};
// initial values
let stats = {
  count: 0,
  // Infinity / -Infinity so that the first element becomes both min and max
  min: Infinity,
  max: -Infinity,
  avg: 0,
  // initial SD is theoretically controversial (N/A), but that's not the point
  sd: 0,
};
// loopStart goes here ... an infinite one
// many things go here ... eventually, we have a `newElement`
stats = calculateNewStats(stats, newElement);
// loopEnd goes here
I have searched for some time and found some mathematical equations (like this), but after carefully applying them the resulting numbers were not correct.
The algorithm on the page you linked does work; here is a working implementation:
const calculateNewStats = (stats, newElement) => {
  const newStats = {};
  newStats.count = stats.count + 1;
  newStats.min = Math.min(stats.min, newElement);
  newStats.max = Math.max(stats.max, newElement);
  newStats.avg = (stats.avg * stats.count + newElement) / newStats.count;
  // Population variance update: uses both the old and the new average
  newStats.sd = Math.sqrt(
    (
      (newStats.count - 1) * stats.sd * stats.sd +
      (newElement - newStats.avg) * (newElement - stats.avg)
    ) / (newStats.count)
  );
  return newStats;
};
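// initial values
let stats = {
  count: 0,
  // Infinity / -Infinity so that the first element becomes both min and max
  // (with 0 as the start value, min would stay 0 for positive durations)
  min: Infinity,
  max: -Infinity,
  avg: 0,
  sd: 0
};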
let newElements = [1, 2, 3, 4, 0];
for (let newElement of newElements) {
  stats = calculateNewStats(stats, newElement);
  console.log(stats);
}
Maybe you missed the last sentence?
"If you want the population variance or standard deviation replace N-1 with N and N-2 with N-1."
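For reference, the population form of the update that the code above implements can be written out as follows. The notation is chosen here and is not taken from the linked page: $\bar{x}_n$ and $\sigma_n^2$ stand for the average and the variance of the first $n$ elements, and $x_n$ is the new element.

```latex
\bar{x}_n = \frac{(n-1)\,\bar{x}_{n-1} + x_n}{n},
\qquad
\sigma_n^2 = \frac{(n-1)\,\sigma_{n-1}^2 + (x_n - \bar{x}_n)(x_n - \bar{x}_{n-1})}{n}
```

As a check against the example table, adding $x_5 = 0$ to $[1,2,3,4]$ (where $\bar{x}_4 = 2.5$ and $\sigma_4^2 = 1.25$) gives $\bar{x}_5 = 2$ and $\sigma_5^2 = \frac{4 \cdot 1.25 + (0 - 2)(0 - 2.5)}{5} = 2$, so $\sigma_5 = \sqrt{2} \approx 1.4142$.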
Note: there will be a small loss of precision that will get larger as you add elements. I would advise you to:
- store the variance in stats together with sd; right now I'm calculating the square root of the variance to get the SD, then squaring the SD to get the variance in the next iteration
- store the total in stats, instead of recalculating it with stats.avg * stats.count on every iteration
You're storing 2 more numbers in stats, but you should be getting better precision in your numbers.
This is a better implementation:
const calculateNewStats = (stats, newElement) => {
  const newStats = {};
  newStats.count = stats.count + 1;
  newStats.total = stats.total + newElement;
  newStats.min = Math.min(stats.min, newElement);
  newStats.max = Math.max(stats.max, newElement);
  // Average from the running total, not from stats.avg * stats.count
  newStats.avg = newStats.total / newStats.count;
  // Variance is stored directly, so no sqrt/square round trip is needed
  newStats.variance = (
    (newStats.count - 1) * stats.variance +
    (newElement - newStats.avg) * (newElement - stats.avg)
  ) / (newStats.count);
  newStats.sd = Math.sqrt(newStats.variance);
  return newStats;
};
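// initial values
let stats = {
  count: 0,
  total: 0,
  // Infinity / -Infinity so that the first element becomes both min and max
  // (with 0 as the start value, min would stay 0 for positive durations)
  min: Infinity,
  max: -Infinity,
  avg: 0,
  sd: 0,
  variance: 0
};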
let newElements = [1, 2, 3, 4, 0];
for (let newElement of newElements) {
  stats = calculateNewStats(stats, newElement);
  console.log(stats);
}
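If precision over a very long run is a concern, a commonly used alternative is Welford's online algorithm, which stores the running sum of squared deviations instead of the variance and updates the average incrementally rather than from a growing total. The following is a minimal sketch of that technique, not the method from the linked page; `emptyStats`, `addElement`, `populationSd`, `sampleSd` and the `m2` field are hypothetical names.

```javascript
// State: count, min, max, running average, and m2 (sum of squared deviations).
const emptyStats = () => ({
  count: 0,
  min: Infinity,
  max: -Infinity,
  avg: 0,
  m2: 0,
});

const addElement = (stats, x) => {
  const count = stats.count + 1;
  const delta = x - stats.avg;            // deviation from the old average
  const avg = stats.avg + delta / count;  // incremental average update
  const m2 = stats.m2 + delta * (x - avg); // old deviation times new deviation
  return {
    count,
    min: Math.min(stats.min, x),
    max: Math.max(stats.max, x),
    avg,
    m2,
  };
};

// The SD is derived on demand; NaN marks the cases where it is undefined.
const populationSd = (stats) =>
  stats.count > 0 ? Math.sqrt(stats.m2 / stats.count) : NaN;
const sampleSd = (stats) =>
  stats.count > 1 ? Math.sqrt(stats.m2 / (stats.count - 1)) : NaN;

let welford = emptyStats();
for (const x of [1, 2, 3, 4, 0]) {
  welford = addElement(welford, x);
  console.log(welford.count, welford.min, welford.max, welford.avg,
    populationSd(welford));
}
```

Keeping `m2` rather than the variance means the same state can produce either the population SD (divide by N) or the sample SD (divide by N-1) without changing the update step.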