Instrumentation inside the spgm() function from the splm package doesn't seem to work: the estimation results are identical to those of the base model (that is, the same formula but without instruments).
I've tried the following things: setting method to every possibility (w2sls, b2sls, g2sls and ec2sls), and changing the lag and spatial.error arguments (which are booleans).
I can't seem to find why it's not working. Here's a reproducible example using a panel data data.frame from the plm package:
library(splm)
library(plm)
sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=Galician_Spain.1252 LC_CTYPE=Galician_Spain.1252 LC_MONETARY=Galician_Spain.1252 LC_NUMERIC=C
## [5] LC_TIME=Galician_Spain.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.31 plm_2.4-1 splm_1.5-2
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.6 bdsmatrix_1.3-4 lattice_0.20-41 deldir_0.2-10 class_7.3-18 zoo_1.8-8 gtools_3.8.2
## [8] assertthat_0.2.1 digest_0.6.27 lmtest_0.9-38 utf8_1.1.4 R6_2.5.0 evaluate_0.14 coda_0.19-4
## [15] e1071_1.7-4 spam_2.6-0 highr_0.8 pillar_1.5.1 Rdpack_2.1.1 miscTools_0.6-26 rlang_0.4.10
## [22] spdep_1.1-5 gdata_2.18.0 raster_3.4-5 gmodels_2.18.1 Matrix_1.3-2 splines_4.0.5 stringr_1.4.0
## [29] xfun_0.22 compiler_4.0.5 pkgconfig_2.0.3 maxLik_1.4-6 tidyselect_1.1.0 tibble_3.0.6 expm_0.999-6
## [36] codetools_0.2-18 fansi_0.4.2 crayon_1.4.1 dplyr_1.0.5 sf_0.9-7 MASS_7.3-53.1 rbibutils_2.0
## [43] spatialreg_1.1-5 grid_4.0.5 nlme_3.1-152 spData_0.3.8 lifecycle_1.0.0 DBI_1.1.1 magrittr_2.0.1
## [50] units_0.7-0 ibdreg_0.3.1 KernSmooth_2.23-18 stringi_1.5.3 pryr_0.1.4 LearnBayes_2.15.1 sp_1.4-5
## [57] ellipsis_0.3.1 generics_0.1.0 vctrs_0.3.6 boot_1.3-27 sandwich_3.0-0 Formula_1.2-4 tools_4.0.5
## [64] forcats_0.5.1 glue_1.4.2 markdown_1.1 purrr_0.3.4 parallel_4.0.5 classInt_0.4-3 dotCall64_1.0-1
data(Produc)
data(usaww)
head(Produc[c("gsp", "pcap", "pc", "hwy")])
## gsp pcap pc hwy
## 1 28418 15032.67 35793.80 7325.80
## 2 29375 15501.94 37299.91 7525.94
## 3 31303 15972.41 38670.30 7765.42
## 4 33430 16406.26 40084.01 7907.66
## 5 33749 16762.67 42057.31 8025.52
## 6 33604 17316.26 43971.71 8158.23
# Baseline model. No instruments
GM <- spgm(log(gsp) ~ log(pcap) + log(pc), data=Produc,
lag = TRUE,
spatial.error = FALSE,
method = "w2sls",
listw = usaww)
## Warning in if (model == "fixed" & !isTRUE(attr(terms(formula), "intercept"))) formula <- as.formula(paste(attr(terms(formula), : the condition has
## length > 1 and only the first element will be used
summary(GM)
##
## Call:spgm(formula = log(gsp) ~ log(pcap) + log(pc), data = Produc, listw = usaww, lag = TRUE, spatial.error = FALSE, method = "w2sls")
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.2054157 -0.0290282 0.0017208 0.0260133 0.2340702
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## lambda 0.367638 0.043093 8.5313 < 2e-16
## log(pcap) 0.087213 0.034721 2.5118 0.01201
## log(pc) 0.498388 0.037026 13.4603 < 2e-16
##
## Residual variance (sigma squared): 0.0023562, (sigma: 0.048541)
# Model with instruments (log(hwy) is being used to instrument log(pcap))
GM_instru <- spgm(log(gsp) ~ log(pcap) + log(pc), data=Produc,
lag = TRUE,
spatial.error = FALSE,
method = "w2sls",
instruments = ~log(hwy) + log(pc),
listw = usaww)
## Warning in if (model == "fixed" & !isTRUE(attr(terms(formula), "intercept"))) formula <- as.formula(paste(attr(terms(formula), : the condition has
## length > 1 and only the first element will be used
summary(GM_instru)
##
## Call:spgm(formula = log(gsp) ~ log(pcap) + log(pc), data = Produc, listw = usaww, lag = TRUE, spatial.error = FALSE, instruments = ~log(hwy) +
## log(pc), method = "w2sls")
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.2054157 -0.0290282 0.0017208 0.0260133 0.2340702
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## lambda 0.367638 0.043093 8.5313 < 2e-16
## log(pcap) 0.087213 0.034721 2.5118 0.01201
## log(pc) 0.498388 0.037026 13.4603 < 2e-16
##
## Residual variance (sigma squared): 0.0023562, (sigma: 0.048541)
Why does instrumentation not work?
Because the model is not specified properly: log(pcap) is never declared as endogenous. Try:
GM_instru <- spgm(log(gsp) ~ log(pc), data = Produc,
                  lag = TRUE,
                  spatial.error = FALSE,
                  endog = ~log(pcap),
                  method = "w2sls",
                  instruments = ~log(hwy),
                  listw = usaww)
The endogenous variable has its own argument, endog, and should not be included in the formula object. The instruments argument then only needs the external instrument, log(hwy).
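To check that the instrument now has an effect, the two fits can be compared side by side. The following is only a sketch: it assumes both calls run on your installed splm version and that the fitted object stores its estimates in a named coefficients element (which is what summary() appears to print). The object names fit_base, fit_iv, b_base and b_iv are made up for illustration.

```r
library(splm)                  # spgm() and the usaww weights matrix
data(Produc, package = "plm")  # panel data used in the question
data(usaww)

# Baseline: log(pcap) treated as exogenous, no external instruments
fit_base <- spgm(log(gsp) ~ log(pcap) + log(pc), data = Produc,
                 listw = usaww, lag = TRUE, spatial.error = FALSE,
                 method = "w2sls")

# log(pcap) declared endogenous and instrumented by log(hwy)
fit_iv <- spgm(log(gsp) ~ log(pc), data = Produc,
               listw = usaww, lag = TRUE, spatial.error = FALSE,
               method = "w2sls",
               endog = ~log(pcap), instruments = ~log(hwy))

# Assumption: estimates live in $coefficients, as a named vector
# or a one-column matrix with row names
b_base <- drop(fit_base$coefficients)
b_iv   <- drop(fit_iv$coefficients)

# Align by coefficient name, since the ordering may differ between fits
common <- intersect(names(b_base), names(b_iv))
cbind(baseline = b_base[common], instrumented = b_iv[common])
```

If the two columns still match exactly, the endog argument is probably not being picked up, and it is worth checking the Call: line printed by summary(fit_iv).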