In Stata, I've recently found that when I use the same variable across multiple interaction terms in one regression model, Stata flags that variable for collinearity. For instance, running:
regress dep i.gender##c.age i.ethnicity##c.age
produces the following message:
note: age omitted because of collinearity
Age is still included in the subsequent regression table, but twice: the first time with a coefficient, SE, etc., as one would expect, and a second time marked as omitted.
I've done similar analyses many times before, but never had this (or at least never noticed it). It's jarring, because it goes without saying that age is collinear with itself. But that shouldn't matter, because it's not as though there are two variables called 'age' which I'm trying to enter simultaneously. It's very clearly one variable which I'm using in two interaction terms. Has anyone else come across this, and do they know a way of suppressing it?
Yes. Stata does not check whether the same variable appears more than once in the term list, so the second copy of age is dropped as collinear. You can hide the variables omitted due to collinearity with the noomitted option, or make sure each term enters the regression only once by using a single # for the interaction terms.
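sysuse nlsw88, clear
reg wage i.south##c.age i.union##c.age // age enters twice; second copy omitted
reg wage i.south##c.age i.union#c.age i.union // each term enters only once
reg wage age age age // duplicate copies of age are omitted
reg wage age age age, noomitted // hide the omitted rows in the table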
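To see why age shows up twice, it may help to write out what ## is shorthand for. This is only a sketch on the same nlsw88 example data; the spelled-out specifications below are my own illustration, not something from the question, and I would expect the third command to reproduce the collinearity note.

```stata
sysuse nlsw88, clear

* A##B is shorthand for A B A#B, so these two commands fit the same model
reg wage i.south##c.age
reg wage i.south c.age i.south#c.age

* with two ## terms, c.age is listed twice, hence the collinearity note
reg wage i.south c.age i.south#c.age i.union c.age i.union#c.age

* listing each term once gives the same fit without an omitted row
reg wage i.south i.union c.age i.south#c.age i.union#c.age
```

For the model in the question, the analogous specification would be i.gender i.ethnicity c.age i.gender#c.age i.ethnicity#c.age.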
Another possible cause is sparse data: if you construct too many interaction terms, a generated dummy variable can end up with only one kind of observation (either all 1s or all 0s). See the next example.
cls
reg wage i.south##age // age is treated as categorical here; runs fine
replace south = 1 if age == 46
reg wage i.south##age // note: 1.south#46.age omitted because of collinearity
reg wage i.south##c.age // runs fine
In any case, I am not sure this applies to your issue, because you explicitly regress on continuous age interacted with ethnicity and gender. I don't think this problem arises with a continuous variable, though I am not certain.