Tags: c, algorithm, average

Calculate average value of a sensor data stream


Is there a feasible way to calculate a running average of an incoming stream of sensor data? I get multiple samples per second and the stream can run for several hours, so accumulating the values and keeping track of the number of samples does not seem feasible. Is there a more pragmatic approach to this problem, perhaps at the cost of some accuracy? The best solution I have come up with so far is an IIR implementation along the lines of x[n] = 0.99*x[n-1] + 0.01*y (sketched below), but this does not really give the average I need.
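
The filter I have in mind is roughly the following (a minimal sketch; the names are illustrative):

    /* Exponentially weighted moving average (EWMA), as described above.
       It weights recent samples more heavily, so it tracks the recent
       level of the signal rather than the true average over the whole
       stream. */
    static double ewma = 0.0;

    void ewma_update(double sample)
    {
        ewma = 0.99 * ewma + 0.01 * sample;
    }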


Solution

  • Calculating an exact average requires the sum of samples and a count of samples. There is not really a way around that.

    Keeping an average, A_N, and updating it for a new sample:

    A_{N+1} = (A_N * N + s_{N+1}) / (N+1)

    is completely equivalent to keeping a running sum (the term A_N * N), so it does not avoid that bookkeeping; see the first sketch below.

    The potential problem with keeping a sum of samples is that the exact value may exceed the number of significant bits in the representation (whether it is integer or floating-point).

    To get around that (in case the largest native integer type is not sufficient), either an arbitrary-precision integer library or a "home-made" solution can be used.

    A home-made solution could be to keep "buckets of sums", either with a fixed number of samples in each bucket or with a per-bucket count. The overall average can then be calculated as a weighted average of the per-bucket averages, using floating-point arithmetic; see the second sketch below.
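
    First, the plain sum-and-count bookkeeping in C (a minimal sketch; names and types are illustrative assumptions). With a 64-bit sum, even hours of, say, 16-bit samples arriving at a few hundred per second stay far below the overflow limit:

        #include <stdint.h>

        static int64_t  sum   = 0;   /* running sum of all samples */
        static uint64_t count = 0;   /* number of samples seen     */

        void average_add(int32_t sample)
        {
            sum += sample;
            count++;
        }

        double average_get(void)
        {
            /* Exact average; falls back to 0.0 before the first sample. */
            return count ? (double)sum / (double)count : 0.0;
        }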
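
    Second, a sketch of the "buckets of sums" idea with a fixed number of samples per bucket (the sizes and names are illustrative assumptions). Full buckets are collapsed into per-bucket averages, and the overall average is the weighted combination of the closed buckets and the currently open one:

        #include <stddef.h>
        #include <stdint.h>

        #define SAMPLES_PER_BUCKET 1000
        #define MAX_BUCKETS        4096

        static double  bucket_avg[MAX_BUCKETS]; /* average of each closed bucket    */
        static size_t  buckets_closed = 0;      /* number of closed buckets         */
        static int64_t open_sum = 0;            /* sum of the currently open bucket */
        static size_t  open_n   = 0;            /* samples in the open bucket       */

        void bucket_add(int sample)
        {
            open_sum += sample;
            open_n++;
            /* Close the bucket once it is full (if there is still room for it;
               otherwise the open bucket simply keeps accumulating). */
            if (open_n == SAMPLES_PER_BUCKET && buckets_closed < MAX_BUCKETS) {
                bucket_avg[buckets_closed++] = (double)open_sum / SAMPLES_PER_BUCKET;
                open_sum = 0;
                open_n   = 0;
            }
        }

        double bucket_average(void)
        {
            size_t total_samples = buckets_closed * SAMPLES_PER_BUCKET + open_n;
            double weighted_sum  = (double)open_sum;
            size_t i;

            if (total_samples == 0)
                return 0.0;

            /* Weight each closed bucket's average by its sample count. */
            for (i = 0; i < buckets_closed; i++)
                weighted_sum += bucket_avg[i] * SAMPLES_PER_BUCKET;

            return weighted_sum / (double)total_samples;
        }

    The weights are simply the per-bucket sample counts, so the result matches the exact average up to floating-point rounding; the only accuracy loss is in the per-bucket division, not in the count of samples.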