splm package: Instruments in spgm() function don't seem to work


Instrumentation in the spgm() function from the splm package doesn't seem to work: the estimation results are identical to those of the base model (that is, the same formula but without instruments).

I've tried the following things:

  • Changing the argument method to every possibility (w2sls, b2sls, g2sls and ec2sls)
  • Using a time-invariant variable as an instrument (in the Produc data.frame, that would be the variable 'region') instead of a region-invariant one, as in the reproducible example below.
  • Changing between SAR, SEM and SAC specifications (i.e. every possible combination of both lag and spatial.error arguments, which are booleans)

I can't figure out why it's not working. Here's a reproducible example using a panel data.frame from the plm package:

library(splm)
library(plm)

sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=Galician_Spain.1252  LC_CTYPE=Galician_Spain.1252    LC_MONETARY=Galician_Spain.1252 LC_NUMERIC=C                   
## [5] LC_TIME=Galician_Spain.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.31 plm_2.4-1  splm_1.5-2
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6         bdsmatrix_1.3-4    lattice_0.20-41    deldir_0.2-10      class_7.3-18       zoo_1.8-8          gtools_3.8.2      
##  [8] assertthat_0.2.1   digest_0.6.27      lmtest_0.9-38      utf8_1.1.4         R6_2.5.0           evaluate_0.14      coda_0.19-4       
## [15] e1071_1.7-4        spam_2.6-0         highr_0.8          pillar_1.5.1       Rdpack_2.1.1       miscTools_0.6-26   rlang_0.4.10      
## [22] spdep_1.1-5        gdata_2.18.0       raster_3.4-5       gmodels_2.18.1     Matrix_1.3-2       splines_4.0.5      stringr_1.4.0     
## [29] xfun_0.22          compiler_4.0.5     pkgconfig_2.0.3    maxLik_1.4-6       tidyselect_1.1.0   tibble_3.0.6       expm_0.999-6      
## [36] codetools_0.2-18   fansi_0.4.2        crayon_1.4.1       dplyr_1.0.5        sf_0.9-7           MASS_7.3-53.1      rbibutils_2.0     
## [43] spatialreg_1.1-5   grid_4.0.5         nlme_3.1-152       spData_0.3.8       lifecycle_1.0.0    DBI_1.1.1          magrittr_2.0.1    
## [50] units_0.7-0        ibdreg_0.3.1       KernSmooth_2.23-18 stringi_1.5.3      pryr_0.1.4         LearnBayes_2.15.1  sp_1.4-5          
## [57] ellipsis_0.3.1     generics_0.1.0     vctrs_0.3.6        boot_1.3-27        sandwich_3.0-0     Formula_1.2-4      tools_4.0.5       
## [64] forcats_0.5.1      glue_1.4.2         markdown_1.1       purrr_0.3.4        parallel_4.0.5     classInt_0.4-3     dotCall64_1.0-1
data(Produc) 
data(usaww)

head(Produc[c("gsp", "pcap", "pc", "hwy")])
##     gsp     pcap       pc     hwy
## 1 28418 15032.67 35793.80 7325.80
## 2 29375 15501.94 37299.91 7525.94
## 3 31303 15972.41 38670.30 7765.42
## 4 33430 16406.26 40084.01 7907.66
## 5 33749 16762.67 42057.31 8025.52
## 6 33604 17316.26 43971.71 8158.23
# Baseline model. No instruments
GM <- spgm(log(gsp) ~ log(pcap) + log(pc), data=Produc,
           lag = TRUE,
           spatial.error = FALSE,
           method = "w2sls",
           listw = usaww)
## Warning in if (model == "fixed" & !isTRUE(attr(terms(formula), "intercept"))) formula <- as.formula(paste(attr(terms(formula), : the condition has
## length > 1 and only the first element will be used
summary(GM)
## 
## Call:spgm(formula = log(gsp) ~ log(pcap) + log(pc), data = Produc,     listw = usaww, lag = TRUE, spatial.error = FALSE, method = "w2sls")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.2054157 -0.0290282  0.0017208  0.0260133  0.2340702 
## 
## Coefficients: 
##           Estimate Std. Error t value Pr(>|t|)
## lambda    0.367638   0.043093  8.5313  < 2e-16
## log(pcap) 0.087213   0.034721  2.5118  0.01201
## log(pc)   0.498388   0.037026 13.4603  < 2e-16
## 
## Residual variance (sigma squared): 0.0023562, (sigma: 0.048541)
# Model with instruments (log(hwy) is being used to instrument log(pcap))
GM_instru <- spgm(log(gsp) ~ log(pcap) + log(pc), data=Produc,
                  lag = TRUE,
                  spatial.error = FALSE,
                  method = "w2sls",
                  instruments = ~log(hwy) + log(pc),
                  listw = usaww)
## Warning in if (model == "fixed" & !isTRUE(attr(terms(formula), "intercept"))) formula <- as.formula(paste(attr(terms(formula), : the condition has
## length > 1 and only the first element will be used
summary(GM_instru)
## 
## Call:spgm(formula = log(gsp) ~ log(pcap) + log(pc), data = Produc,     listw = usaww, lag = TRUE, spatial.error = FALSE, instruments = ~log(hwy) + 
##         log(pc), method = "w2sls")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.2054157 -0.0290282  0.0017208  0.0260133  0.2340702 
## 
## Coefficients: 
##           Estimate Std. Error t value Pr(>|t|)
## lambda    0.367638   0.043093  8.5313  < 2e-16
## log(pcap) 0.087213   0.034721  2.5118  0.01201
## log(pc)   0.498388   0.037026 13.4603  < 2e-16
## 
## Residual variance (sigma squared): 0.0023562, (sigma: 0.048541)

Why does instrumentation not work?


Solution

  • Because you are not specifying the model properly. Try:

    GM_instru <- spgm(log(gsp) ~  log(pc), data=Produc,
                      lag = TRUE,
                      spatial.error = FALSE, endog = ~log(pcap),
                      method = "w2sls",
                      instruments = ~log(hwy),
                      listw = usaww)
    

    The endogenous variable has its own argument, endog, and should not also appear in the formula. In the call above, instruments then lists only the external instrument for it, log(hwy).
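
    To check that the instrument is actually used, one option is to fit both models and print their coefficients side by side. This is only a minimal sketch: the helper coef_vec() is a hypothetical name introduced here, and it assumes the fitted object stores its estimates in a coefficients element (as a named vector or a one-column matrix), which may differ between splm versions.

```r
library(splm)
library(plm)

data(Produc)
data(usaww)

# Hypothetical helper: return the estimates as a plain named vector,
# whether `coefficients` is stored as a vector or a one-column matrix.
coef_vec <- function(fit) as.matrix(fit$coefficients)[, 1]

# Baseline: log(pcap) is treated as exogenous.
GM <- spgm(log(gsp) ~ log(pcap) + log(pc), data = Produc,
           lag = TRUE,
           spatial.error = FALSE,
           method = "w2sls",
           listw = usaww)

# Instrumented: log(pcap) moves from the formula to `endog`,
# and log(hwy) is supplied as its instrument.
GM_instru <- spgm(log(gsp) ~ log(pc), data = Produc,
                  lag = TRUE,
                  spatial.error = FALSE,
                  endog = ~log(pcap),
                  method = "w2sls",
                  instruments = ~log(hwy),
                  listw = usaww)

b_base <- coef_vec(GM)
b_iv   <- coef_vec(GM_instru)

print(b_base)
print(b_iv)
```

    If the two printed vectors differ, the instrument is being used. The order of the coefficients may not be the same in both fits, so compare them by name rather than by position.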