Search code examples
rstatisticshierarchical-dataconfidence-intervalstatistics-bootstrap

Confidence Interval from hierarchical bootstrap


I would like to calculate a BCa confidence interval for multi-stage bootstrap using boot.ci(). Here is an example from: Non-parametric bootstrapping on the highest level of clustered data using boot() function from {boot} in R which uses the boot command.

# creating example df
rho <- 0.4
dat <- expand.grid(
  trial=factor(1:5),
  subject=factor(1:3)
)
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17); dat$value <- chol(sig) %*% rnorm(15, 0, 1)

# function for resampling
resamp.mean <- function(dat, 
                    indices, 
                    cluster = c('subject', 'trial'), 
                    replace = TRUE){
  cls <- sample(unique(dat[[cluster[1]]]), replace=replace)
  sub <- lapply(cls, function(b) subset(dat, dat[[cluster[1]]]==b))
  sub <- do.call(rbind, sub)
  mean(sub$value)
} 

dat.boot <- boot(dat, resamp.mean, 4) # produces and estimated statistic

boot.ci(data.boot) # produces errors

How can I use boot.ci on the boot output?


Solution

  • You have used too few bootstrap resamples. When you call boot.ci, influence measures are needed, and if not provided they are obtained from empinf, which may fail with too few observations. See here for an explanation along similar lines.

    Try

    dat.boot <- boot(dat, resamp.mean, 1000) 
    boot.ci(dat.boot, type = "bca") 
    

    which gives:

    > boot.ci(dat.boot, type = "bca") 
    
    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1000 bootstrap replicates
    
    CALL : 
    boot.ci(boot.out = dat.boot, type = "bca")
    
    Intervals : 
    Level       BCa          
    95%   (-0.2894,  1.2979 )  
    Calculations and Intervals on Original Scale
    Some BCa intervals may be unstable
    

    As an alternative, you can provide L (the influence measures) yourself.

    # proof of concept, use appropriate value for L!
    > dat.boot <- boot(dat, resamp.mean, 4)
    > boot.ci(dat.boot, type = "bca", L = 0.2)
    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 4 bootstrap replicates
    
    CALL : 
    boot.ci(boot.out = dat.boot, type = "bca", L = 0.2)
    
    Intervals : 
    Level       BCa          
    95%   ( 0.1322,  1.2979 )  
    Calculations and Intervals on Original Scale
    Warning : BCa Intervals used Extreme Quantiles
    Some BCa intervals may be unstable