Search code examples
rregressionanovastatistics-bootstrapquantreg

Performing Anova on Bootstrapped Estimates from Quantile Regression


So I'm using the quantreg package in R to conduct quantile regression analyses to test how the effects of my predictors vary across the distribution of my outcome.

FML <- as.formula(outcome ~ VAR + c1 + c2 + c3)
quantiles <- c(0.25, 0.5, 0.75)

q.Result <- list()
for (i in quantiles){
    i.no <- which(quantiles==i)
    q.Result[[i.no]] <- rq(FML, tau=i, data, method="fn", na.action=na.omit)
}

Then i call anova.rq which runs a Wald test on all the models and outputs a pvalue for each covariate telling me whether the effects of each covariate vary significantly across the distribution of my outcome.

anova.Result <- anova(q.Result[[1]], q.Result[[2]], q.Result[[3]], joint=FALSE)

Thats works just fine. However, for my particular data (and in general?), bootstrapping my estimates and their error is preferable. Which i conduct with a slight modification of the code above.

q.Result <- rqs(FML, tau=quantiles, data, method="fn", na.action=na.omit)
q.Summary <- summary(Q.mod, se="boot", R=10000, bsmethod="mcmb",
                            covariance=TRUE)

Here's where i get stuck. The quantreg currently cannot peform the anova (Wald) test on boostrapped estimates. The information files on the quantreg packages specifically states that "extensions of the methods to be used in anova.rq should be made" regarding the boostrapping method.

Looking at the details of the anova.rq method. I can see that it requires 2 components not present in the quantile model when bootstrapping.

1) Hinv (Inverse Hessian Matrix). The package information files specifically states "note that for se = "boot" there is no way to split the estimated covariance matrix into its sandwich constituent parts."

2) J which, according to the information files, is "Unscaled Outer product of gradient matrix returned if cov=TRUE and se != "iid". The Huber sandwich is cov = tau (1-tau) Hinv %*% J %*% Hinv. as for the Hinv component, there is no J component when se == "boot". (Note that to make the Huber sandwich you need to add the tau (1-tau) mayonnaise yourself.)"

Can i calculate or estimate Hinv and J from the bootstrapped estimates? If not what is the best way to proceed?

Any help on this much appreciated. This my first timing posting a question here, though I've greatly benefited from the answers to other peoples questions in the past.


Solution

  • Consulted with a colleague, and he confirmed that it was unlikely that Hinv and J could be 'reverse' computed from bootstrapped estimates. However we resolved that estimates from different taus could be compared using Wald test as follows.

    From object rqs produced by

    q.Summary <- summary(Q.mod, se="boot", R=10000, bsmethod="mcmb", covariance=TRUE)
    

    you extract the bootstrapped Beta values for variable of interest in this case VAR, the first covariate in FML for each tau

    boot.Bs <- sapply(q.Summary, function (x) x[["B"]][,2])
    B0 <- coef(summary(lm(FML, data)))[2,1] # Extract liner estimate data linear estimate
    

    Then compute wald statistic and get pvalue with number of quantiles for degrees of freedom

    Wald <- sum(apply(boot.Bs, 2, function (x) ((mean(x)-B0)^2)/var(x)))
    Pvalue <- pchisq(Wald, ncol(boot.Bs), lower=FALSE)
    

    You also want to verify that bootstrapped Betas are normally distributed, and if you're running many taus it can be cumbersome to check all those QQ plots so just sum them by row

    qqnorm(apply(boot.Bs, 1, sum))
    qqline(apply(boot.Bs, 1, sum), col = 2)
    

    This seems to be working, and if anyone can think of anything wrong with my solution, please share