Search code examples
rstatisticslmrobust

Why is lm_robust() HC3 standard error smaller than coeftest() HC0 standard error?


I am using lm_robust of package 'estimatr' for a fixed effect model including HC3 robust standard errors. I had to switch from vcovHC(), because my data sample was just to large to be handled by it.

using following line for the regression:

lm_robust(log(SPREAD) ~ PERIOD, data = dat, fixed_effects = ~ STOCKS + TIME, se_type = "HC3")

The code runs fine, and the coefficients are the same as using fixed effects from package plm. Since I can not use coeftest to estimate HC3 standard errors with the plm output due to a too large data sample, I compared the HC3 estimator of lm_robustwith the HC1 of coeftest(model, vcov= vcovHC(model, type = HC1)) As result the HC3 standarderror of lm_robust is much smaller than HC1 from coeftest.

Does somebody has an explanation, since HC3 should be more restrictive than HC1. I appreciate any recommendations and solutions.

EDIT model used for coeftest:

plm(log(SPREAD) ~ PERIOD, data = dat, index = c("STOCKS", "TIME"), effect = "twoway", method = "within")


Solution

  • It appears that the vcovHC() method for plm automatically estimates cluster-robust standard errors, while for lm_robust(), it does not. Therefore, the HC1 estimation of the standard error for plm will appear inflated compared to lm_robust (of lm for that matter).

    Using some toy data:

    library(sandwich)
    library(tidyverse)
    library(plm)
    library(estimatr)
    library(lmtest)
    
    set.seed(1981)
    x <- sin(1:1000)
    y <- 1 + x + rnorm(1000)
    f <- as.character(sort(rep(sample(1:100), 10)))
    t <- as.character(rep(sort(sample(1:10)), 100))
    
    dat <- tibble(y = y, x = x, f = f, t = t)
    
    lm_fit <- lm(y ~ x + f + t, data = dat)
    plm_fit <- plm(y ~ x, index = c("f", "t"), model = "within", effect = "twoways", data = dat)
    rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC1", return_vcov = TRUE)
    
    sqrt(vcovHC(lm_fit, type = "HC1")[2, 2])
    #> [1] 0.04752337
    sqrt(vcovHC(plm_fit, type = "HC1"))
    #>            x
    #> x 0.05036414
    #> attr(,"cluster")
    #> [1] "group"
    sqrt(rb_fit$vcov)
    #>            x
    #> x 0.04752337
    
    rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC3", return_vcov = TRUE)
    sqrt(vcovHC(lm_fit, type = "HC3")[2, 2])
    #> [1] 0.05041177
    sqrt(vcovHC(plm_fit, type = "HC3"))
    #>            x
    #> x 0.05042142
    #> attr(,"cluster")
    #> [1] "group"
    sqrt(rb_fit$vcov)
    #>            x
    #> x 0.05041177
    

    There does not appear to be equivalent cluster-robust standard error types in the two packages. However, the SEs get closer when specifying cluster-robust SEs in lm_robust():

    rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, clusters = f, data = dat, se_type = "CR0")
    summary(rb_fit)
    #> 
    #> Call:
    #> lm_robust(formula = y ~ x, data = dat, clusters = f, fixed_effects = ~f + 
    #>     t, se_type = "CR0")
    #> 
    #> Standard error type:  CR0 
    #> 
    #> Coefficients:
    #>   Estimate Std. Error t value  Pr(>|t|) CI Lower CI Upper DF
    #> x    0.925    0.05034   18.38 1.133e-33   0.8251    1.025 99
    #> 
    #> Multiple R-squared:  0.3664 ,    Adjusted R-squared:  0.2888
    #> Multiple R-squared (proj. model):  0.3101 ,  Adjusted R-squared (proj. model):  0.2256 
    #> F-statistic (proj. model): 337.7 on 1 and 99 DF,  p-value: < 2.2e-16
    coeftest(plm_fit, vcov. = vcovHC(plm_fit, type = "HC1"))
    #> 
    #> t test of coefficients:
    #> 
    #>   Estimate Std. Error t value  Pr(>|t|)    
    #> x 0.925009   0.050364  18.366 < 2.2e-16 ***
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    

    Created on 2020-04-16 by the reprex package (v0.3.0)