Bootstrap confidence intervals for different ranges of values


I would like to bootstrap my data to get confidence intervals around the mean of y for different ranges of x. For example, with the code below I calculate y from my vector x and then bootstrap the mean of y to get a CI.

set.seed(001)

x <- rnorm(length(iris$Sepal.Length),iris$Sepal.Length,0.05*iris$Sepal.Length)

mydata<-data.frame(x=x,y=pi*x^2)

library(boot)
myboot<-boot(mydata$y, function(u,i) mean(u[i]), R = 999)

boot.ci(myboot, type = c("perc"))

My question is: how can I compute the bootstrapped CI of the mean for different ranges of x, such as 3-4, 4-5, 5-6, 6-7, and 7-8?


Solution

  • How about this: bin x with cut() and run a stratified bootstrap whose statistic returns the mean of y in each bin.

    set.seed(001)
    x <- rnorm(length(iris$Sepal.Length),iris$Sepal.Length,0.05*iris$Sepal.Length)
    
    mydata<-data.frame(x=x,y=pi*x^2) 
    mydata$x_cut <- cut(x, breaks=c(3,5,6,7,9))
    
    boot_fun <- function(data, inds){
      tmp <- data[inds, ]
      tapply(tmp$y, tmp$x_cut, mean)
    }
    library(boot)
    myboot<-boot(mydata, boot_fun, R = 999, strata=mydata$x_cut)
    
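    # Without `index`, boot.ci() reports only the first statistic, i.e. the (3,5] bin
    boot.ci(myboot, type = c("perc"))
    # One percentile CI per bin; $percent holds the level, two order-statistic
    # indices, and then the lower and upper limits
    cis <- sapply(1:4, function(i)boot.ci(myboot, type="perc", index=i)$percent)
    colnames(cis) <- names(myboot$t0)
    cis <- cis[4:5, ]  # keep only the lower and upper limits
    rownames(cis) <- c("Lower", "Upper")
    cis <- t(cis)      # one row per bin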
    
    cis
    #           Lower     Upper
    # (3,5]  67.79231  72.81593
    # (5,6]  92.25999  97.25919
    # (6,7] 124.65315 130.88061
    # (7,9] 167.58324 183.88702
    

    For BCa intervals, use:

    boot.ci(myboot, type = c("bca"))
    cis <- sapply(1:4, function(i)boot.ci(myboot, type="bca", index=i)$bca)
    colnames(cis) <- names(myboot$t0)
    cis <- cis[4:5, ]
    rownames(cis) <- c("Lower", "Upper")
    cis <- t(cis)
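
The answer above uses the breaks 3, 5, 6, 7, 9, while the question asks for unit-width ranges (3-4, 4-5, and so on). The following is a rough sketch of how the same approach might be adapted to those ranges. It assumes that values of x outside (3, 8] should simply be dropped, and that bins with very few observations should be skipped, because a stratified bootstrap cannot produce a meaningful interval for them. The threshold `min_n` is a hypothetical choice made for illustration, not something from the original answer.

```r
library(boot)

set.seed(1)
x <- rnorm(nrow(iris), iris$Sepal.Length, 0.05 * iris$Sepal.Length)
mydata <- data.frame(x = x, y = pi * x^2)

# Unit-width bins as in the question; x outside (3, 8] becomes NA
mydata$x_cut <- cut(mydata$x, breaks = 3:8)

# Hypothetical threshold: skip bins with fewer than min_n observations
min_n <- 5
counts <- table(mydata$x_cut)
keep <- names(counts)[counts >= min_n]
# Drop NA rows and sparse bins, then remove the now-unused factor levels
mydata <- droplevels(mydata[mydata$x_cut %in% keep, ])

# Statistic: mean of y within each bin of the resampled data
boot_fun <- function(data, inds) {
  tmp <- data[inds, ]
  tapply(tmp$y, tmp$x_cut, mean)
}

# Resample within bins so every bin keeps its original size
myboot <- boot(mydata, boot_fun, R = 999, strata = mydata$x_cut)

# Percentile CI per bin; columns 4 and 5 of $percent are the limits
cis <- t(sapply(seq_along(myboot$t0), function(i) {
  boot.ci(myboot, type = "perc", index = i)$percent[4:5]
}))
dimnames(cis) <- list(names(myboot$t0), c("Lower", "Upper"))
cis
```

Using `seq_along(myboot$t0)` instead of a hard-coded `1:4` keeps the loop in step with however many bins survive the filtering.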