How to get the slope of a linear regression line using C++?


I need to compute the slope of a linear regression, in the same way as the Excel function described at the link below:

http://office.microsoft.com/en-gb/excel-help/slope-function-HP010342903.aspx

Is there a library in C++ or a simple coded solution someone has created which can do this?

I have implemented code according to the formula below (taken from http://easycalculation.com/statistics/learn-regression.php), but it does not always give me the correct results:

Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
         = ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)²)
         = (5798.5 - 5784.6) / (96795 - 96721)
         = 13.9 / 74
         = 0.19

If I try it against the following vectors, I get the wrong result (I expect 0.305556):

x = 6,5,11,7,5,4,4
y = 2,3,9,1,8,7,5

Thanks in advance.


Solution

  • Why not just write a simple function like this? It is certainly not the best solution, just an example based on the help article:

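    #include <cstddef>    // size_t
    #include <numeric>    // std::accumulate
    #include <stdexcept>  // std::invalid_argument
    #include <vector>

    double slope(const std::vector<double>& x, const std::vector<double>& y){
        if(x.size() != y.size()){
            // std::exception has no standard constructor taking a string,
            // so throw a derived type that does
            throw std::invalid_argument("...");
        }
        size_t n = x.size();

        // Means of x and y; the initial value 0.0 keeps the sums in double
        double avgX = std::accumulate(x.begin(), x.end(), 0.0) / n;
        double avgY = std::accumulate(y.begin(), y.end(), 0.0) / n;

        double numerator = 0.0;
        double denominator = 0.0;

        // slope = sum((x - avgX) * (y - avgY)) / sum((x - avgX)^2)
        for(size_t i=0; i<n; ++i){
            numerator += (x[i] - avgX) * (y[i] - avgY);
            denominator += (x[i] - avgX) * (x[i] - avgX);
        }

        // All x values equal (or no data): the slope is undefined
        if(denominator == 0.0){
            throw std::invalid_argument("...");
        }

        return numerator / denominator;
    }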
    

    Note that the third argument of std::accumulate must be 0.0 rather than 0. The accumulator takes the type of that argument, so with 0 the compiler deduces int and every partial sum is truncated to an integer, which makes the result wrong whenever the data has fractional values (it is actually wrong with MSVC2010 and mingw-w64 when 0 is passed as the third argument).
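
The effect described in the note can be reproduced with a small sketch. The two helper names below are hypothetical and exist only for this illustration; the only difference between them is the type of the initial value passed to `std::accumulate`.

```cpp
#include <numeric>  // std::accumulate
#include <vector>

// Hypothetical helper: the initial value 0 is an int, so the running sum
// is stored in an int and each partial sum is truncated toward zero.
double sumWithIntInit(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Hypothetical helper: the initial value 0.0 is a double, so the running
// sum keeps its fractional part.
double sumWithDoubleInit(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

For the input {0.5, 0.5, 0.5, 0.5}, `sumWithIntInit` returns 0 because each partial sum of 0.5 is truncated back to 0, while `sumWithDoubleInit` returns 2. With integer-valued data such as the vectors in the question the two sums happen to coincide, so the bug can go unnoticed until fractional values appear.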