matlab, linear-regression, curve-fitting, least-squares, polynomials

Linear regression (regress) discrepancy with polynomial fit (polyfit)


I have some data which comes from a linear function (y=mx+c) where m=4, c=1 (so: y=4x+1).

When I try to recover the coefficients using regress, I get R² < 1 and a seemingly arbitrary value for m:

x = [1 2 3 4]
y = [5 9 13 17]
[m,bint,r,rint,stats] = regress(y',x');

%{
>> R = stats(1) % Coefficient of determination
R =
     1
>> m % Linear function coefficients
m = 
     4.333333333333333
%}

Whereas polyfit recovers both coefficients correctly:

P = polyfit(x,y,1);

%{
>> P(1)
ans =
    4.000000000000000
>> P(2)
ans =
    1.000000000000000
%}

Why is this happening?


Solution

  • The source of your problem is that the call does not follow the documentation of regress, which states that:

    b = regress(y,X) returns a vector b of coefficient estimates for a multiple linear regression of the responses in vector y on the predictors in matrix X. The matrix X must include a column of ones.

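    Without that column, regress fits the intercept-free model y = mx, whose least-squares slope for this data is sum(x.*y)/sum(x.^2) = 130/30 ≈ 4.333, which is the value you got. If we include a column of ones in the 2nd input, we get the desired result: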

    x = [1 2 3 4].';
    y = [5 9 13 17].';
    [m,bint,r,rint,stats] = regress(y,[ones(size(x)) x]);
    
    %{
    Results:
    m =
        1.0000
        4.0000
    bint =
        1.0000    1.0000
        4.0000    4.0000
    r =
       1.0e-14 *
        0.1776
        0.1776
        0.1776
             0
    rint =
       1.0e-13 *
        0.0178    0.0178
       -0.2190    0.2545
       -0.2190    0.2545
       -0.2141    0.2141
    stats =
       1.0e+31 *
        0.0000    1.6902    0.0000    0.0000
    %}
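
The difference between the two models can also be checked without regress. The following is a minimal sketch, assuming that the backslash operator (ordinary least squares in base MATLAB) is an acceptable stand-in for regress here; it is used only so the example runs without the Statistics Toolbox. The variable names (m_no_intercept, m_closed_form, X, b) are illustrative and do not appear in the question.

```matlab
% Data from the question: y = 4x + 1
x = [1 2 3 4].';
y = [5 9 13 17].';

% Model WITHOUT an intercept, i.e. what regress(y, x) fits: y ~ m*x
m_no_intercept = x \ y;                      % least-squares slope through the origin
m_closed_form  = sum(x .* y) / sum(x .^ 2);  % same slope from the normal equation, 130/30

% Model WITH an intercept: the design matrix gets a column of ones
X = [ones(size(x)) x];
b = X \ y;                                   % b(1) = intercept c, b(2) = slope m

% polyfit returns the highest power first, so the order is reversed
P = polyfit(x, y, 1);                        % P(1) = slope m, P(2) = intercept c

fprintf('no intercept:   m = %.4f\n', m_no_intercept);
fprintf('with intercept: c = %.4f, m = %.4f\n', b(1), b(2));
fprintf('polyfit:        m = %.4f, c = %.4f\n', P(1), P(2));
```

The first fit reproduces the slope of about 4.3333 from the question, while the second recovers c = 1 and m = 4. Note that the coefficient order differs: the design-matrix fit returns the intercept first, whereas polyfit returns the slope first.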