I have a model which contains a time trend variable for 7 years (so 2000 = 1, 2001 = 2, ..., 2006 = 7) as well as dummy variables for 6 of the years (one binary variable for each year except 2000). When I ask R to fit this linear model:
olsmodel <- lm(lnyield ~ lnx1 + lnx2 + lnx3 + lnx4 + lnx5 + x6 + x7 + x8 +
               timetrend + yeardummy2001 + yeardummy2002 + yeardummy2003 +
               yeardummy2004 + yeardummy2005 + yeardummy2006)
I get NAs for the last dummy variable in the model summary, along with the message "Coefficients: (1 not defined because of singularities)".
I do not know why this is happening, since all of the x_i variables are continuous and, as far as I can tell, no subset of the dummies and the time trend is a linear combination of the others.
Any help as to why this might be happening would be much appreciated!
The problem is that when you set the year trend to 1:n and also include a dummy variable for each year, the resulting covariate matrix is not of full column rank.
Say there are only 3 categories r1, r2, r3, and the model is y ~ trend + c2 + c3. The covariate matrix you get is:
> mat
     int trend c2 c3
[1,]   1     1  0  0
[2,]   1     1  0  0
[3,]   1     2  1  0
[4,]   1     2  1  0
[5,]   1     3  0  1
[6,]   1     3  0  1
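If you want to follow along in R, the toy matrix is easy to build (a minimal sketch; two observations per year, matching the rows printed above):

# intercept, linear trend, and dummies for categories 2 and 3
mat <- cbind(
  int   = rep(1, 6),
  trend = rep(1:3, each = 2),
  c2    = rep(c(0, 1, 0), each = 2),
  c3    = rep(c(0, 0, 1), each = 2)
)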
The column rank of mat is only 3 rather than the 4 coefficients you need to estimate: the trend column is an exact linear combination of the others, trend = int + c2 + 2*c3. Hence t(mat) %*% mat is singular, which is why lm() drops the last collinear term and reports it as NA.
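You can verify the rank deficiency and reproduce the NA directly; a quick sketch continuing from the mat built above (the response y here is just simulated noise):

# column rank is 3, but there are 4 coefficients to estimate
qr(mat)$rank
# [1] 3

# the exact linear dependence among the columns
all(mat[, "trend"] == mat[, "int"] + mat[, "c2"] + 2 * mat[, "c3"])
# [1] TRUE

# fitting the toy model reproduces the symptom: the last collinear
# term is aliased and reported as NA
set.seed(1)
y <- rnorm(6)
summary(lm(y ~ mat[, "trend"] + mat[, "c2"] + mat[, "c3"]))
# Coefficients: (1 not defined because of singularities)

The fix is to drop either timetrend or one of the year dummies: a full set of year dummies already absorbs any linear trend across years, so keeping both adds no information.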