I have an unbalanced panel of 108 countries over a period of 28 years, and I am trying to estimate a model with Panel-Corrected Standard Errors. But my attempts are failing because I get the following error message:
Error in pcse(lm, groupN = data$id, groupT = data$time, pairwise = TRUE): Length of groupN and groupT must equal nrows of using data.
My dataset looks roughly like this:
library(plm)
library(data.table) # for setDT()
library(pcse) # for pcse()
data(Grunfeld)
setDT(Grunfeld)[firm %in% c(1, 4, 7, 9) & year >= 1950, inv := NA] # creating unbalanced data
head(Grunfeld,20)
# firm year inv value capital
# 1: 1 1935 317.6 3078.5 2.8
# 2: 1 1936 391.8 4661.7 52.6
# 3: 1 1937 410.6 5387.1 156.9
# 4: 1 1938 257.7 2792.2 209.2
# 5: 1 1939 330.8 4313.2 203.4
# ....
# 15: 1 1949 555.1 3700.2 1020.1
# 16: 1 1950 NA 3755.6 1099.0
# 17: 1 1951 NA 4833.0 1207.7
# 18: 1 1952 NA 4924.9 1430.5
# 19: 1 1953 NA 6241.7 1777.3
# 20: 1 1954 NA 5593.6 2226.3
For some firms, the dependent variable (inv) is missing for the last few years (1950-54).
To compute the PCSEs, I first estimate the linear model. I am using lags for theoretical reasons.
lm<- lm(inv ~ lag(value,k=1)+ lag(capital, k = 1) + as.factor(year) + as.factor(firm), data = Grunfeld)
summary(lm)
I then try to compute the panel-corrected standard errors, but when I run the command the error message appears:
lm.pcse <- pcse(lm, groupN=Grunfeld$firm, groupT=Grunfeld$year,
pairwise=TRUE)
# Error in pcse(lm, groupN = Grunfeld$firm, groupT = Grunfeld$year, pairwise = TRUE) :
# Length of groupN and groupT must equal nrows of using data.
Does anyone know how I can resolve this issue?
Thanks a lot for your help.
I've never been able to get this package to work outside of the included demo--but I have solved this problem before (only to encounter a new one!).
Your error is likely caused by the fact that the "using data" does not include the observations omitted by lm(), but your groupN and groupT vectors do (as they are extracted from the full data table, missing data and all). What I've done in the past is run the model, extract the "using data" with model.frame(), then use that new data frame to run lm() and pcse(). Something like the following:
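lm <- lm(inv ~ lag(value, k = 1) + lag(capital, k = 1) + as.factor(year) + as.factor(firm), data = Grunfeld)
# model.frame() names its columns after the formula terms (e.g. `as.factor(firm)`),
# so use its row names to pull the rows lm() actually kept from the original data
dfPCSE <- Grunfeld[as.integer(rownames(model.frame(lm))), ]
lm <- lm(inv ~ lag(value, k = 1) + lag(capital, k = 1) + as.factor(year) + as.factor(firm), data = dfPCSE)
lm.pcse <- pcse(lm, groupN = dfPCSE$firm, groupT = dfPCSE$year,
pairwise = TRUE)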
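A related way to get the same alignment is to drop the incomplete rows yourself before fitting, so that lm() has nothing left to omit and the grouping vectors come from the same data frame as the model. The following is only a minimal sketch on made-up data, not a tested recipe for your panel: the names panel, x, y, x_lag1, used and fit are hypothetical, and the lag is built by hand within each firm because stats::lag() on a plain numeric vector only changes the time attribute and does not shift the values. The pcse() call is attempted only if the package is installed, and I have not verified that it succeeds for every unbalanced design.

```r
# Toy unbalanced panel (hypothetical data): 4 firms x 6 years, with the
# outcome missing for firms 1 and 3 in the last two years.
set.seed(1)
panel <- expand.grid(year = 2001:2006, firm = 1:4)
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 0.5 * panel$x + rnorm(nrow(panel))
panel$y[panel$firm %in% c(1, 3) & panel$year >= 2005] <- NA

# Lag x within each firm by hand (first year of each firm becomes NA).
panel <- panel[order(panel$firm, panel$year), ]
panel$x_lag1 <- ave(panel$x, panel$firm,
                    FUN = function(v) c(NA, head(v, -1)))

# Keep only rows where every model variable is observed,
# so lm() silently drops nothing.
used <- panel[complete.cases(panel[, c("y", "x_lag1")]), ]
fit <- lm(y ~ x_lag1 + factor(firm), data = used)

# The grouping vectors now have exactly one entry per row used in the fit.
stopifnot(nrow(model.frame(fit)) == length(used$firm),
          nrow(model.frame(fit)) == length(used$year))

# Attempted only when the pcse package is available; wrapped in try()
# because pcse() may still reject some unbalanced panels.
if (requireNamespace("pcse", quietly = TRUE)) {
  fit_pcse <- try(pcse::pcse(fit, groupN = used$firm, groupT = used$year,
                             pairwise = TRUE))
  if (!inherits(fit_pcse, "try-error")) print(summary(fit_pcse))
}
```

The point of the design is that used is the single source for the model data, groupN and groupT, so their lengths cannot drift apart.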