I had a look at how plm (R package for panel models) implements the Breusch-Pagan test for random effects in plmtest() and wonder if it can handle unbalanced panels.
For unbalanced panels, we need a modified version of the Breusch-Pagan test for random effects, as given by Baltagi/Li (1990): A Lagrange multiplier test for the error components model with incomplete panels, Econometric Reviews, 9:1, 103-107, DOI: 10.1080/07474939008800180. As this paper is a bit hard to read, you can also look at how Stata does it: http://www.stata.com/manuals13/xtxtregpostestimation.pdf
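For reference, here is the statistic as I understand it from the Stata manual linked above. The notation is my own: the \hat{v}_{it} are the pooled OLS residuals, T_i is the number of observations for individual i, n is the number of individuals, and N is the total number of observations. Under the null of no individual effects, the statistic is compared against a chi-squared distribution with 1 degree of freedom.

```latex
\lambda_{LM} = \frac{N^2}{2} \cdot \frac{A_1^2}{\sum_{i=1}^{n} T_i^2 - N},
\qquad
A_1 = 1 - \frac{\sum_{i=1}^{n} \left( \sum_{t=1}^{T_i} \hat{v}_{it} \right)^2}
               {\sum_{i=1}^{n} \sum_{t=1}^{T_i} \hat{v}_{it}^2},
\qquad
N = \sum_{i=1}^{n} T_i
```

With a balanced panel (T_i = T for all i), this reduces to the classical Breusch-Pagan statistic nT / (2(T-1)) * A_1^2.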
EDIT: The modified test allowing for unbalanced panels is now in the package on CRAN (since version 1.6-4).
Edit: the CRAN version of plm from 1.6-4 on (December 2016) also features the unbalanced test statistics in plmtest().
Since this is resolved now, I will post an answer here.
The code is now in the development version v1.5-16 of plm on R-Forge: https://r-forge.r-project.org/projects/plm/ and https://r-forge.r-project.org/R/?group_id=406
Here is how to replicate an example from Stata's documentation:
# get data set from Stata's webpage
# It is an unbalanced panel
library(haven) # required to read Stata data file
library(plm)   # provides pdata.frame(), plm() and plmtest()
nlswork <- read_dta("http://www.stata-press.com/data/r14/nlswork.dta")
nlswork$race <- factor(nlswork$race) # fix data
nlswork$race2 <- factor(ifelse(nlswork$race == 2, 1, 0)) # need this variable for example
pnlswork <- pdata.frame(nlswork, index = c("idcode", "year"), drop.index = FALSE)
# Note: by default, Stata 14 uses a different variance components estimator than plm's default (Swamy–Arora).
# This is why the random effects coefficients differ slightly from those in Stata's web examples.
plm_re_nlswork <- plm(ln_wage ~ grade + age + I(age^2) + ttl_exp + I(ttl_exp^2) + tenure + I(tenure^2) + race2 + not_smsa + south
, data = pnlswork, model = "random")
# resembles the FE estimation by Stata in Example 2 of http://www.stata.com/manuals13/xtxtreg.pdf
plm_fe_nlswork <- plm(ln_wage ~ grade + age + I(age^2) + ttl_exp + I(ttl_exp^2) + tenure + I(tenure^2) + race2 + not_smsa + south
, data = pnlswork, model = "within")
plm_pool_nlswork <- plm(ln_wage ~ grade + age + I(age^2) + ttl_exp + I(ttl_exp^2) + tenure + I(tenure^2) + race2 + not_smsa + south
, data = pnlswork, model = "pooling")
# Run Breusch-Pagan test with modification for unbalanced panels of Baltagi/Li (1990)
# resembles Example 1 in http://www.stata.com/manuals13/xtxtregpostestimation.pdf
plmtest(plm_pool_nlswork, type = "bp") # type = "bp" requests Breusch-Pagan; plmtest() defaults to the Honda test
## Lagrange Multiplier Test - individual effects - Breusch-Pagan Test for unbalanced Panels as in Baltagi/Li (1990)
## data: ln_wage ~ grade + age + I(age^2) + ttl_exp + I(ttl_exp^2) + tenure + ...
## BP_unbalanced = 14779.98, df = 1, p-value < 0.00000000000000022
## alternative hypothesis: significant effects
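To see what the unbalanced statistic does, here is a minimal base-R sketch that computes it by hand from pooled OLS residuals. It assumes the form of the statistic given in the Stata manual linked above. bp_unbalanced() is a hypothetical helper of mine, not a function in plm, and the panel is simulated, so the numbers have nothing to do with nlswork.

```r
# Hypothetical helper (not part of plm): Breusch-Pagan LM statistic for
# individual effects in the unbalanced form described in the Stata manual.
# res: pooled OLS residuals, id: individual identifier of each observation
bp_unbalanced <- function(res, id) {
  N   <- length(res)                                    # total number of observations
  T_i <- as.vector(table(id))                           # observations per individual
  A1  <- 1 - sum(tapply(res, id, sum)^2) / sum(res^2)
  stat <- (N^2 / 2) * A1^2 / (sum(T_i^2) - N)
  c(statistic = stat,
    p.value   = pchisq(stat, df = 1, lower.tail = FALSE))
}

# Simulated unbalanced panel (made-up data, for illustration only)
set.seed(1)
T_i <- sample(2:6, 50, replace = TRUE)   # 50 individuals with 2 to 6 periods each
id  <- rep(seq_along(T_i), times = T_i)
x   <- rnorm(length(id))
y   <- 1 + 0.5 * x + rnorm(50)[id] + rnorm(length(id))  # individual effect plus noise

pooled <- lm(y ~ x)                       # pooled OLS, ignoring the panel structure
bp_unbalanced(residuals(pooled), id)
```

On real data, the result of such a hand computation could be compared against plmtest(..., type = "bp") on the pooling model as a sanity check.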