I am running a panel data regression using the plm
package in R
and want to control for multicollinearity between the explanatory variables.
I know there is the vif()
function in the car
-package, however as far as I know, it cannot deal with panel data output.
The plm
can do other diagnostics such as a unit root test but I found no method to calculate for multicollinearity.
Is there a way to calculate a similar test to vif
, or can I just regard each variable as a time-series, leaving out the panel information and run tests using the car
package?
I cannot disclose the data, but the problem should be relevant to all panel data models.
The dimension is roughly 1,000 observations, over 50 time-periods.
The code I use looks like this:
pdata <- pdata.frame(RegData, index=c("id","time"))
fixed <- plm(Y~X, data=pdata, model="within")
and then
vif(fixed)
returns an error.
This question has been asked with reference to other statistical packages such as SAS https://communities.sas.com/thread/47675 and Stata http://www.stata.com/statalist/archive/2005-08/msg00018.html and the common answer has been to use pooled model to get VIF. The logic is that since multicollinearity is only about independent variable there is no need to control for individual effects using panel methods.
Here's some code extracted from another site:
mydata=read.csv("US Panel Data.csv")
attach(mydata) # not sure is that's really needed
Y=cbind(Return) # not sure what that is doing
pdata=pdata.frame(mydata, index=c("id","t"))
model=plm(Y ~ 1+ESG+Beta+Market.Cap+PTBV+Momentum+Dummy1+Dummy2+Dummy3+Dummy4+Dummy5+
Dummy6+Dummy7+Dummy8+Dummy9,
data=pdata,model="pooling")
vif(model)