Using the cars data in R, I would like to create a polynomial regression model with varying degree values.
Through research online I have found two methods of creating these models:
library(tidyverse)
data(cars)
# method 1
polyreg_deg2 <- lm(dist ~ speed+I(speed^2), data=cars)
# method 2
polyreg_deg2_again <- lm(dist ~ poly(speed, degree = 2), data = cars)
coefficients(polyreg_deg2)
coefficients(polyreg_deg2_again)
Below is the output of the coefficients:
# > coefficients(polyreg_deg2)
# (Intercept)       speed  I(speed^2)
#   2.4701378   0.9132876   0.0999593
# > coefficients(polyreg_deg2_again)
#              (Intercept) poly(speed, degree = 2)1 poly(speed, degree = 2)2
#                 42.98000                145.55226                 22.99576
I was under the impression that both methods should return the same model.
Please can someone explain why the intercept and coefficients are shown to be different?
Or perhaps point out where my code has been written incorrectly?
Any help is appreciated :)
PS. I'm still learning how to use R for stats, so apologies for my ignorance.
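See the help page ?poly. You need the raw = TRUE argument. By default, poly() builds orthogonal polynomials, so the coefficients are expressed in a different basis even though the fitted model is the same.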
raw: if true, use raw and not orthogonal polynomials.
coefficients(lm(dist ~ speed + I(speed^2), data = cars))
# (Intercept)       speed  I(speed^2)
#   2.4701378   0.9132876   0.0999593

# poly raw = TRUE
coefficients(lm(dist ~ poly(speed, degree = 2, raw = TRUE), data = cars))
#                          (Intercept) poly(speed, degree = 2, raw = TRUE)1 poly(speed, degree = 2, raw = TRUE)2
#                            2.4701378                            0.9132876                            0.0999593

# poly raw = FALSE (default)
coefficients(lm(dist ~ poly(speed, degree = 2), data = cars))
#              (Intercept) poly(speed, degree = 2)1 poly(speed, degree = 2)2
#                 42.98000                145.55226                 22.99576
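If it helps to convince yourself that the two parameterisations describe the same curve, a quick check along these lines should work. This is only a sketch: the object names (fit_raw, fit_orth, fit_manual, P) are made up for illustration and are not from the question.

```r
# cars ships with base R (datasets package), so no extra packages are needed.
fit_manual <- lm(dist ~ speed + I(speed^2), data = cars)
fit_raw    <- lm(dist ~ poly(speed, degree = 2, raw = TRUE), data = cars)
fit_orth   <- lm(dist ~ poly(speed, degree = 2), data = cars)

# Raw polynomials reproduce the hand-written formula's coefficients
# (unname() drops the differing coefficient labels before comparing).
same_coefs <- all.equal(unname(coef(fit_manual)), unname(coef(fit_raw)))

# Orthogonal and raw fits give different coefficients but the same fitted values.
same_fit <- all.equal(fitted(fit_raw), fitted(fit_orth))

# The orthogonal basis columns are orthonormal, so their cross-product
# is (numerically) the 2 x 2 identity matrix.
P <- poly(cars$speed, degree = 2)
basis_is_orthonormal <- all.equal(unname(crossprod(P)), diag(2))

# Because those columns are also orthogonal to the constant term,
# the intercept of the orthogonal fit is simply mean(dist).
intercept_is_mean <- all.equal(unname(coef(fit_orth)[1]), mean(cars$dist))

print(c(same_coefs, same_fit, basis_is_orthonormal, intercept_is_mean))
```

So the choice between the two mostly comes down to what you want from the coefficients: raw = TRUE gives terms you can read directly as multiples of speed and speed^2, while the default orthogonal basis gives uncorrelated terms, which tends to be numerically better behaved at higher degrees.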