
R: Equivalent ways of coding a formula of a lm with higher-order terms


To my knowledge, there are three possible ways to code second-order (and higher-order) terms in a formula.

We can use the function I(..), use the function poly(..), or construct the second-degree variable ourselves. My question is: how do these functions work?

set.seed(23)
A = rnorm(12)
B = 1:12
C = factor(rep(c(1,2,3),4))
B2=B^2

What is the equivalent of lm(A~poly(B,2)*C) when using I(..) or the variable B2?

Using raw=T in the poly(..) function does not change the results, correct?


Solution

  • lm(A~B2*C)
    

    or

    lm(A~I(B^2)*C)
    

    both square column B and then regress A on that squared term and its interaction with C. Neither formula contains the linear term B. Using

    poly(B,2)
    

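    does something different: it returns a two-column basis, one column for degree 1 and one for degree 2, so a model built on it also has a linear component. See ?poly.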

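    Edit to add: by default poly() calculates orthogonal polynomials, which are not the same as the raw polynomials obtained by simply squaring, cubing, etc. a number. With raw=TRUE it returns the raw powers instead; the coefficients then change, but the fitted values do not.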
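
To make the comparison concrete, here is a minimal sketch using the data from the question. The object names (fit_orth, fit_raw, fit_I, fit_B2, fit_sq) are made up for illustration. It assumes that the intended equivalent of poly(B,2) keeps both the linear and the quadratic term, i.e. (B + I(B^2)) or (B + B2). Under that assumption each all.equal() call should return TRUE.

```r
set.seed(23)
A  <- rnorm(12)
B  <- 1:12
C  <- factor(rep(c(1, 2, 3), 4))
B2 <- B^2

# Four ways to write "linear + quadratic in B, interacting with C"
fit_orth <- lm(A ~ poly(B, 2) * C)              # orthogonal basis (the default)
fit_raw  <- lm(A ~ poly(B, 2, raw = TRUE) * C)  # raw powers B and B^2
fit_I    <- lm(A ~ (B + I(B^2)) * C)            # I() protects ^ inside the formula
fit_B2   <- lm(A ~ (B + B2) * C)                # precomputed squared column

# Same column space, so the fitted values should agree
all.equal(fitted(fit_orth), fitted(fit_raw))
all.equal(fitted(fit_raw),  fitted(fit_I))
all.equal(fitted(fit_I),    fitted(fit_B2))

# Raw poly() and I() should give the same coefficient values; the
# interaction terms are listed in a different order, hence the sort()
all.equal(sort(unname(coef(fit_raw))), sort(unname(coef(fit_I))))

# The quadratic-only formula from the answer is a smaller model:
# it estimates 6 coefficients instead of 9
fit_sq <- lm(A ~ B2 * C)
df.residual(fit_sq)   # 12 observations - 6 coefficients
df.residual(fit_I)    # 12 observations - 9 coefficients
```

The orthogonal fit (fit_orth) reports different coefficients from the raw fits because its basis columns are rescaled and decorrelated, but its predictions are the same. That is why raw=TRUE changes the coefficient table without changing the fitted model.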