Calculating Standard Deviation & Variance in C++


So, I've posted a few times and previously my problems were pretty vague. I started C++ this week and have been doing a little project.

I'm trying to calculate standard deviation & variance. My code loads a file of 100 integers and puts them into an array, counts them, calculates the mean, sum, variance and SD. But I'm having a little trouble with the variance.

I keep getting a huge number - I have a feeling it's to do with its calculation.

My mean and sum are ok.

NB: [image: sd & mean calcs]

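#include <cmath>      // sqrt
#include <fstream>    // ifstream
#include <functional> // plus
#include <iostream>   // cout
#include <iterator>   // begin, end
#include <numeric>    // accumulate
#include <sstream>    // stringstream
#include <string>     // string, getline

using namespace std;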

int main() {
    int n = 0;
    int Array[100];
    float mean;
    float var, sd;
    string line;
    float numPoints;

    ifstream myfile("numbers.txt");

    if (myfile.is_open()) {
        while (!myfile.eof()) {
            getline(myfile, line);
            
            stringstream convert(line);
        
            if (!(convert >> Array[n])) {
                Array[n] = 0;
            }

            cout << Array[n] << endl;
            n++;
        }
    
        myfile.close();
        numPoints = n;
    } else
        cout << "Error loading file" << endl;

    int sum = accumulate(begin(Array), end(Array), 0, plus<int>());
    cout << "The sum of all integers: " << sum << endl;

    mean = sum / numPoints;
    cout << "The mean of all integers: " << mean << endl;

    var = (Array[n] - mean) * (Array[n] - mean) / numPoints;
    sd = sqrt(var);
    cout << "The standard deviation is: " << sd << endl;

    return 0;
}

Solution

  • As the other answer by horseshoe correctly suggests, you have to use a loop to calculate the variance; otherwise the statement

    var = ((Array[n] - mean) * (Array[n] - mean)) / numPoints;

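    only considers a single element of the array. Worse, after the read loop n is the index just past the last value stored, so Array[n] was never written (or lies outside the array once n reaches 100). That is where the huge number comes from.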

    Here is a slightly improved version of horseshoe's suggested code:

    var = 0;
    for( n = 0; n < numPoints; n++ )
    {
      var += (Array[n] - mean) * (Array[n] - mean);
    }
    var /= numPoints;
    sd = sqrt(var);
    

    Your sum works without an explicit loop because you are using the accumulate function, which already has a loop inside it. That loop is just not visible in your code; take a look at the equivalent behavior of accumulate for a clear understanding of what it is doing.

    Note: X ?= Y is short for X = X ? Y, where ? can be any binary arithmetic or bitwise operator (for example +, -, * or /). You can also write pow(Array[n] - mean, 2) to take the square instead of multiplying the difference by itself, which some find tidier.
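
If you would rather keep the accumulate style you already use for the sum, the same loop can be written as below. This is only a sketch, not the code from either answer: the helper names mean_of, population_variance and population_sd are made up for illustration, it takes a std::vector<int> instead of your fixed-size Array, and it assumes you want the population variance (dividing by N, as in the loop above).

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean of the values (0 for an empty input).
double mean_of(const std::vector<int>& values) {
    if (values.empty()) return 0.0;
    // Start from 0.0 so the sum is a double and the division is not integer division.
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return sum / values.size();
}

// Hypothetical helper: population variance, i.e. the sum of squared
// deviations from the mean divided by the number of values.
double population_variance(const std::vector<int>& values) {
    if (values.empty()) return 0.0;
    const double mean = mean_of(values);
    // The lambda adds one squared deviation per element; std::accumulate
    // supplies the loop, just as it does for the plain sum.
    const double sum_sq = std::accumulate(
        values.begin(), values.end(), 0.0,
        [mean](double acc, int x) { return acc + (x - mean) * (x - mean); });
    return sum_sq / values.size();
}

// Hypothetical helper: standard deviation is the square root of the variance.
double population_sd(const std::vector<int>& values) {
    return std::sqrt(population_variance(values));
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5, the squared deviations add up to 32, so population_variance gives 4 and population_sd gives 2. Because the functions only look at the elements actually in the vector, they cannot pick up unread array slots the way accumulate(begin(Array), end(Array), ...) can when the file holds fewer than 100 numbers.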