r, linear-regression, dummy-variable

saturated regression with interaction variables


I'm working with the following data in R and need to fit a saturated regression with ed76 as the dependent variable. From what I understand, a saturated regression has to include all of the explanatory variables plus the interactions between the dummy variables. So, say I have the following column variables: nearc2, nearc4, momdad14, step14, ed76, south66, wage, iq. It is my understanding that the regression should look like this:

    Reg <- lm(ed76 ~ nearc2 + nearc4 + momdad14 + step14 + south66 + wage + iq + nearc2*nearc4 + nearc2*momdad14 + nearc2*step14 + ...)

Is there a more efficient way to create the interaction terms among all of one's dummy variables for the purpose of fitting a saturated regression model?


Solution

  • A saturated model requires as many parameters as data points; see, e.g., this answer. So this is likely not what you want, as saturated models are not commonly used with linear models, AFAIK. At least, I am not sure what you would use one for. However, such a model can be fit with:

    lm(ed76 ~ as.factor(seq_along(ed76)))
    

    @G. Grothendieck's answer will only give you a saturated model with lm if it leads to as many parameters as there are observations.
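
To make the parameter count concrete, here is a minimal sketch on simulated data. The variable names are borrowed from the question, but the values, the sample size `n`, and the data frame `d` are made up for illustration and are not the asker's data. It fits the one-level-per-observation model from above and compares it with a model that fully interacts three dummy variables.

```r
# Simulated stand-in data (assumption: the real data set is not available here).
set.seed(1)
n <- 12
d <- data.frame(
  nearc2   = rbinom(n, 1, 0.5),
  nearc4   = rbinom(n, 1, 0.5),
  momdad14 = rbinom(n, 1, 0.5),
  ed76     = rnorm(n, mean = 13, sd = 2)
)

# One factor level per observation: n parameters for n data points.
sat <- lm(ed76 ~ as.factor(seq_along(ed76)), data = d)
length(coef(sat))     # equals n
df.residual(sat)      # 0 residual degrees of freedom
max(abs(resid(sat)))  # effectively zero: the fit reproduces every observation

# All main effects and interactions of three dummies: 2^3 = 8 coefficients.
full <- lm(ed76 ~ nearc2 * nearc4 * momdad14, data = d)
length(coef(full))    # 8, fewer than n, so not saturated in the sense above
```

With only a handful of dummies and many rows, the fully interacted model will generally have far fewer coefficients than observations, which is why it matches the definition above only in special cases.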