I have five points to which I want to fit a 2nd order polynomial in R, by minimizing the sum of squared errors of prediction (SSE).
What's the best way to do this?
So far I have done the following:
(1,5.8), (2,3.9), (3,4.2), (4,5.7), (5,10.2) ## my data
To these data I want to fit a 2nd order polynomial with the intercept set to 10 and the coefficient of x^2 set to 1. I do this:
p<-c(5.8, 3.9, 4.2, 5.7, 10.2)
f<-function(x,a){x^2-a*x+10}
##sum of squared errors of prediction
SSE<-function(a){ (p[i]-f(i,a) )^2 }
nlm(SSE,0.1) ## 0.1 is picked totally at random
But it returns a "wrong" answer:
$minimum
[1] 9.475923e-25
$estimate
[1] 4.96
When I manually calculate the SSE for a = 4.96, I do get SSE = 9.475923e-25. What am I doing wrong? My biggest concern: does the nlm function run through all my data points?
I'm not sure what you mean by "wrong". There are a few glitches in your code: SSE squares a single residual p[i]-f(i,a) instead of summing over all the points, and i is never defined inside the function, so nlm was minimizing the error at whatever i happened to be lying around in your workspace. Evidently that was i = 5: at a = 4.96, f(5,a) = 25 - 24.8 + 10 = 10.2 = p[5] exactly, which is why the reported minimum is essentially zero. So no, nlm was not running through all your data points. With those glitches fixed I get a slightly different answer (a almost exactly equal to 5).
p <- c(5.8, 3.9, 4.2, 5.7, 10.2)
x <- 1:5
f <- function(x,a){x^2-a*x+10}
##sum of squared errors of prediction
SSE <- function(a){ sum((p-f(x,a))^2) }
(n1 <- nlm(SSE,0.1)) ## 0.1 is picked totally at random
## $minimum
## [1] 0.22
## $estimate
## [1] 4.999998
## $gradient
## [1] 1.443291e-10
## $code
## [1] 1
## $iterations
## [1] 3
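You can check that this estimate really does beat your a = 4.96 once all five residuals are counted; the values below are what I compute for these data with the corrected SSE above:
sapply(c(4.96, 5), SSE)
## [1] 0.308 0.220
## note f(5, 4.96) reproduces p[5] exactly -- that single point was all the original code fitted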
This seems like a reasonable fit, although it doesn't match the data points exactly.
par(las=1,bty="l") ## cosmetic
plot(x,p,xlim=c(0,5.5))
xvec <- seq(0,6,length.out=100)
lines(xvec,f(xvec,n1$estimate))
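If you want to see the misfit numerically rather than on the plot, the residuals at the optimum are small but non-zero (again, the output is my computation for these data):
round(p - f(x, n1$estimate), 2)
## [1] -0.2 -0.1  0.2 -0.3  0.2
## squared and summed, these give the 0.22 minimum reported above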
You could also fit this model via
ypar <- p-x^2-10 ## subtract known terms from LHS
m1 <- lm(ypar~x-1) ## fit linear term (-1 suppresses intercept)
-1*coef(m1) ## flip sign
which gives an answer of exactly 5.
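Because the model is linear in the single unknown a, you can also write the least-squares answer in closed form: with ypar = p - x^2 - 10 = -a*x + error, minimizing sum((ypar + a*x)^2) gives a = -sum(x*ypar)/sum(x^2). A minimal sketch using the vectors defined above:
## closed-form least-squares solution for the one free coefficient
-sum(x*ypar)/sum(x^2)
## [1] 5
This is exactly what the lm() fit computes, and it agrees with nlm() up to convergence tolerance.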