Fitting a polynomial

I want to calculate the fitted value for a polynomial.

I would like to calculate 'fit' but without adjusting for the variable 'z'.

What I have now (the expression for 'fit' below) is cumbersome. Eventually I would like to iterate through polynomials of different degrees without having to add one more term to that expression every time.


x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- sample(letters[1:5], 50, TRUE)

f <- lm(y ~ poly(x, 5, raw=TRUE) + as.factor(z), na.action=na.exclude)

fit <- f$coeff[1] + f$coeff[2]*x + f$coeff[3]*x^2 +
  f$coeff[4]*x^3 + f$coeff[5]*x^4 + f$coeff[6]*x^5

Solution

By default, R parametrizes a factor by adding a dummy variable for every level except the first, so the first level is absorbed into the intercept. If you want predictions that leave out the coefficients for the factor z, just ask for predictions where x takes its observed values and z takes its first level. That is probably "a" in your case, but the sampling might never produce an "a", so it's safer to use levels(as.factor(z))[1].

That is:

newdata <- data.frame(x = x, z = levels(as.factor(z))[1])
fit <- predict(f, newdata = newdata)
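
If you want to convince yourself that this matches your hand-written formula, here is a minimal check. It is only a sketch: set.seed() and the names b and manual are my additions for illustration, not part of your code.

```r
set.seed(1)  # assumption: seed added only so the check is reproducible
x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- sample(letters[1:5], 50, TRUE)

f <- lm(y ~ poly(x, 5, raw = TRUE) + as.factor(z), na.action = na.exclude)

# Hand-written version: intercept plus the five polynomial terms only.
# outer(x, 0:5, `^`) builds the columns 1, x, x^2, ..., x^5.
b <- coef(f)[1:6]
manual <- as.vector(outer(x, 0:5, `^`) %*% b)

# predict() version: hold z at its first (baseline) level
newdata <- data.frame(x = x, z = levels(as.factor(z))[1])
fit <- predict(f, newdata = newdata)

# Compare the two, ignoring the row names that predict() attaches
all.equal(unname(fit), manual)
```

The comparison uses all.equal() rather than == because the two routes can differ by floating-point rounding.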

I'd be slightly worried that something might go wrong when as.factor() is re-evaluated on newdata, where z contains only a single value (though it appears to give the same values as your formula), so I'd recommend a slightly different overall approach: change z to a factor before calling lm(). That is,

z <- as.factor(z)
f <- lm(y ~ poly(x, 5, raw=TRUE) + z, na.action=na.exclude)
levs <- levels(z)
newdata <- data.frame(x = x, z = factor(levs[1], levels = levs))
fit <- predict(f, newdata = newdata)
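
Since you eventually want to iterate over different polynomial degrees, the same idea can be wrapped in a small function. This is a sketch under my own assumptions: fit_degree, degrees and fits are hypothetical names, and set.seed() is there only for reproducibility.

```r
set.seed(1)  # assumption: seed added only for reproducibility
x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- as.factor(sample(letters[1:5], 50, TRUE))

# Prediction grid: observed x, with z held at its baseline level
levs <- levels(z)
newdata <- data.frame(x = x, z = factor(levs[1], levels = levs))

# Hypothetical helper: fit a raw polynomial of degree d in x, adjusted for z,
# and return the fitted values with the z coefficients left out
fit_degree <- function(d) {
  f <- lm(y ~ poly(x, d, raw = TRUE) + z, na.action = na.exclude)
  predict(f, newdata = newdata)
}

degrees <- 1:5
fits <- sapply(degrees, fit_degree)  # one column of fitted values per degree
colnames(fits) <- paste0("degree_", degrees)
dim(fits)
```

Each column of fits then plays the role of your 'fit' for one degree, and no term has to be written out by hand when the degree changes.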