Search code examples
arraysexcelvbalinear-regression

Polynomial regression with array


It appears that I'm not the only one to struggle with this, but I can't find an good answer, so I try my chance here!

I would like to find a 3rd degree polynomial line of best fit, given two sets of data stored in arrays. Basically, it looks like that

Dim X(0 to 9) as Integer
Dim Y(0 to 9) as Integer
for k = 0 to 9
   X(k) = 'something
   Y(k) = 'something else
Next

So far I've worked out how to solve my issue with a 1st degree polynomial Y = aX + b :

a = Application.WorksheetFunction.LinEst(Y, X, True, False)(1)
b = Application.WorksheetFunction.LinEst(Y, X, True, False)(2)

I also found that if my values X and Y are written into a Sheet, I can find a higher polynomial with this :

 'x-axis values are entered in X column, y-values in Y column
 Z = Application.Evaluate("=linest(Y1:Y10,X1:X10^{1,2,3})")
 'The answer is in Z such that Y = Z(1)*Y^3+Z(2)*Y^2+Z(3)*Y+Z(4)

Assuming my arrays are already sorted, how can I use linest with arrays and not values entered into cells?


Solution

  • If you want the OLS best fit coefficients (i.e. linear regression) and not interpolation, then you can do it like this for a second order polynomial:

    Sub test()
    
        Dim X(0 To 9, 0 To 1) As Integer
        Dim Y(0 To 9, 0 To 0) As Integer
    
    
        i = 0
        For n = 0 To 9
           X(n, 0) = i
           X(n, 1) = i * i
           Y(n, 0) = i * i + 3 * i - 7
           i = i + 1
        Next
    
        B = WorksheetFunction.LinEst(Y, X)
        B2 = B(1)
        B1 = B(2)
        B0 = B(3)
    
    End Sub
    

    This correctly returns -7 for B0, 3 for B1 and 1 for B2. This is pretty much the same as this answer: https://stackoverflow.com/a/27137957/1011724

    You can make it more general, say a k-order polynomial like this:

    Function f(X) As Integer ' This function is replaced by your data
        f = -2 * WorksheetFunction.Power(X, 3) + 3 * X - 7
    End Function
    
    Sub test2()
    
        Order = 3
    
        ReDim X(0 To 9, 0 To Order - 1)
        Dim Y(0 To 9, 0 To 0) As Integer
    
        'Note i is only to generate dummy data
        i = 1
        For n = 0 To 9 'Replace 9 with the length of your data
           X(n, 0) = i 'This line is replaced by your actual data
           ' Create the higher order features:
           For k = 1 To Order
               X(n, k-1) = Application.WorksheetFunction.Power(i, k)
           Next
           Y(n, 0) = f(i) 'This line is replaced by your actual data
           i = i + 1
        Next
    
        B = WorksheetFunction.LinEst(Y, X)
    
    End Sub