r, linear-regression

Why does the first-degree term of poly(x, 2) not get an estimate in a multiple regression?


My regression model has two polynomial terms, but the first-degree term of one of them (here age) gets no estimate, while both the exper and exper^2 terms are estimated without trouble. I want to understand why.

My code is:

library(wooldridge)
data('card')

fit1_ols <- lm(lwage~educ + poly(exper, 2) + black + smsa + south + poly(age, 2),  data = card)
fit2_ols <- lm(lwage~educ + poly(exper, 2) + black + smsa + south + poly(age, 2) + motheduc + fatheduc, data = card)

summ(fit1_ols)
summ(fit2_ols)

The outputs are, respectively:

----------------------------------------------------
                         Est.   S.E.   t val.      p
--------------------- ------- ------ -------- ------
(Intercept)              5.26   0.05   107.40   0.00
educ                     0.07   0.00    21.10   0.00
poly(exper, 2)1          8.93   0.49    18.05   0.00
poly(exper, 2)2         -2.71   0.41    -6.66   0.00
black                   -0.19   0.02   -10.75   0.00
smsa                     0.16   0.02    10.36   0.00
south                   -0.12   0.02    -8.23   0.00
poly(age, 2)1            ???                           
poly(age, 2)2            0.17   0.41     0.41   0.68
----------------------------------------------------

----------------------------------------------------
                         Est.   S.E.   t val.      p
--------------------- ------- ------ -------- ------
(Intercept)              5.18   0.06    87.10   0.00
educ                     0.07   0.00    16.99   0.00
poly(exper, 2)1          9.45   0.60    15.65   0.00
poly(exper, 2)2         -2.89   0.51    -5.67   0.00
black                   -0.16   0.02    -6.69   0.00
smsa                     0.16   0.02     8.74   0.00
south                   -0.11   0.02    -6.22   0.00
poly(age, 2)1            ???                           
poly(age, 2)2            0.20   0.49     0.41   0.69
motheduc                 0.01   0.00     2.24   0.03
fatheduc                -0.00   0.00    -0.25   0.80
----------------------------------------------------

Solution

  • If you use the standard summary() instead of summ() (for which you did not load the required library, jtools), R tells you the problem:

    library(wooldridge)
    data('card')
    
    fit1_ols <- lm(lwage~educ + poly(exper, 2) + black + smsa + south + poly(age, 2),  data = card)
    fit2_ols <- lm(lwage~educ + poly(exper, 2) + black + smsa + south + poly(age, 2) + motheduc + fatheduc, data = card)
    
    summary(fit1_ols)
    summary(fit2_ols)
    

    [Images: summary() output for fit1_ols and fit2_ols]

    As you can see, the output reports that one coefficient is "not defined because of singularities." Here is a helpful post on how to deal with this issue. Basically, the missing term is a linear combination of other regressors in the model (perfect collinearity), so lm() cannot estimate it separately and you need to change the specification of your model.
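
    The singularity can be reproduced and diagnosed with a small sketch. It uses simulated data, not the card data set, and assumes that the cause is an experience variable built as an exact linear function of age and education (here exper = age - educ - 6, which is how exper appears to be defined in card; that is worth checking on the real data). All variable names and coefficient values below are illustrative.

```r
# Simulated data (NOT the card data set). Assumption: exper is an exact
# linear function of age and educ, which is the suspected cause of the NA.
set.seed(1)
n     <- 500
educ  <- sample(8:18, n, replace = TRUE)
age   <- sample(24:34, n, replace = TRUE)
exper <- age - educ - 6
lwage <- 5 + 0.07 * educ + 0.08 * exper - 0.002 * exper^2 + rnorm(n, sd = 0.4)

fit <- lm(lwage ~ educ + poly(exper, 2) + poly(age, 2))

# The first-degree age term is a linear combination of the intercept, educ
# and the first-degree exper term, so lm() drops it and reports NA.
coef(fit)

# The model matrix has 6 columns but only rank 5.
c(columns = length(coef(fit)), rank = fit$rank)

# alias() shows which combination of the other terms reproduces the dropped one.
alias(fit)
```

    On the real data, a check such as with(card, all(age == educ + exper + 6)) should confirm or rule out this explanation. If it holds, the linear age effect cannot be separated from educ and exper, and one of these terms has to leave the model. The second-degree age term is still estimated because age^2 contains an educ * exper cross product that the other regressors do not span.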