I want to calculate the fitted value for a polynomial.
I would like to calculate 'fit' but without adjusting for the variable 'z'.
What I have now is cumbersome, and eventually I would like to iterate through polynomials of different degrees without having to add one more term to the equation for 'fit' below every time.
x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- sample(letters[1:5], 50, TRUE)
f <- lm(y ~ poly(x, 5, raw=TRUE) + as.factor(z), na.action=na.exclude)
fit <- f$coeff[1] + f$coeff[2]*x + f$coeff[3]*x^2 + f$coeff[4]*x^3 +
  f$coeff[5]*x^4 + f$coeff[6]*x^5
By default, R parametrizes a factor by adding a dummy variable for every level except the first. So if you want the predictions for your data to exclude the coefficients for the factor z, just ask for predictions where x takes on the true values and z takes on its first level (probably "a" in your case, but sampling might never give an "a", so it's better to use levels(as.factor(z))[1]).
That is:
newdata <- data.frame(x = x, z = levels(as.factor(z))[1])
fit <- predict(f, newdata = newdata)
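As a sanity check, here is a minimal sketch comparing this prediction with the hand-written polynomial sum. The seed and the names fit_pred, fit_manual and X are my own additions for illustration, and z is converted to a factor up front to keep the example simple.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- factor(sample(letters[1:5], 50, TRUE))

f <- lm(y ~ poly(x, 5, raw = TRUE) + z)

# Default treatment contrasts: one dummy column per level of z except the first
colnames(model.matrix(f))

# Prediction with z held at its first level, so no dummy coefficient is added
newdata <- data.frame(x = x, z = factor(levels(z)[1], levels = levels(z)))
fit_pred <- predict(f, newdata = newdata)

# Manual version: intercept plus the five raw polynomial terms
X <- cbind(1, poly(x, 5, raw = TRUE))
fit_manual <- as.vector(X %*% coef(f)[1:6])

# Compare the two up to numerical tolerance
all.equal(unname(fit_pred), fit_manual)
```

The manual version relies on the intercept and polynomial coefficients being the first six entries of coef(f), which holds for this formula because the polynomial term comes before z.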
I'd be slightly worried that something might go wrong in computing as.factor() in newdata (though it appears to give the same values as your formula), so I'd recommend a slightly different overall approach: change z to a factor before calling lm(). That is,
z <- as.factor(z)
f <- lm(y ~ poly(x, 5, raw=TRUE) + z, na.action=na.exclude)
levs <- levels(z)
newdata <- data.frame(x = x, z = factor(levs[1], levels = levs))
fit <- predict(f, newdata = newdata)
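To iterate over polynomial degrees without writing out the terms, the same idea can be wrapped in a function. This is only a sketch: fit_for_degree and fits are hypothetical names, the seed is arbitrary, and the range 1:5 is just an example.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- runif(n = 50, min = 1, max = 10)
y <- runif(n = 50, min = 10, max = 20)
z <- as.factor(sample(letters[1:5], 50, TRUE))

# Hold z at its first level for every prediction
levs <- levels(z)
newdata <- data.frame(x = x, z = factor(levs[1], levels = levs))

# Fit a raw polynomial of degree d (plus z) and predict at the baseline level
fit_for_degree <- function(d) {
  f <- lm(y ~ poly(x, d, raw = TRUE) + z, na.action = na.exclude)
  predict(f, newdata = newdata)
}

# One column of fitted values per degree
fits <- sapply(1:5, fit_for_degree)
colnames(fits) <- paste0("degree_", 1:5)
dim(fits)
```

Each column of fits then plays the role of 'fit' for one degree, so adding a degree means changing the range passed to sapply() rather than editing a formula by hand.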