Search code examples
algorithmmathstatisticsstandard-deviationonline-algorithm

Online Algorithm for Standard Deviation Proof


I saw this algorithm in an answer to this question.

Does this correctly calculate standard deviation? Can someone walk me through why this works mathematically? Preferably working back from this formula:

enter image description here

public class Statistics {

    private int n;
    private double sum;
    private double sumsq;

    public void reset() {
        this.n = 0;
        this.sum = 0.0;
        this.sumsq = 0.0;
    }

    public synchronized void addValue(double x) {
        ++this.n;
        this.sum += x;
        this.sumsq += x*x;
    }

    public synchronized double calculateMean() {
        double mean = 0.0;
        if (this.n > 0) {
            mean = this.sum/this.n;
        }
        return mean;
    }

    public synchronized double calculateStandardDeviation() {
        double deviation = 0.0;
        if (this.n > 1) {
            deviation = Math.sqrt((this.sumsq - this.sum*this.sum/this.n)/(this.n-1));
        }
        return deviation;
    }
}

Solution

  • There is a proof on wikipedia at the start of the section I linked to.

    enter image description here

    By the way, I remember from somewhere that calculating this way can produce more error. As you can see this.sumsq can become huge. Whereas calculating the normal way always has smaller intermediate values.

    Anyway, I do use this online calculation a lot, because most of the time error didn't matter that much.