Tags: r, confidence-interval, statistics-bootstrap

Bootstrap confidence interval based on a single observation


Can we use a bootstrap CI to calculate a 95% confidence interval from a single observation x, where 0<=x<=100?

> n=1
> x=98
> mean_est=mean(x)
> nboot <- 2000
> resample_dist <- rep(NA, length = nboot)
> for (i in 1:nboot) {
+   resample_i <- sample(x, size = n, replace = TRUE)
+   resample_dist[[i]] <- mean(resample_i)
+ }
> b_lci <- quantile(resample_dist, probs = 0.025)
> b_uci <- quantile(resample_dist, probs = 0.975)
> 
> 
> sprintf("Bootstrapped: %.3f [%.3f, %.3f]", mean_est, b_lci, b_uci)
[1] "Bootstrapped: 98.000 [3.000, 96.000]"

Solution

  • No, it does not work that way.

    Let's start with the code. The documentation of sample() reads:

    If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x

    So your code resample_i <- sample(x, size = n, replace = TRUE) for x = 98 actually samples 1 number between 1 and 98.

    That is why you get a result at all, but it is not the resampling you intended.
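
    The behaviour described above can be checked directly. The following is a minimal sketch, not part of the original question: the seed is arbitrary, and indexing through sample.int() is just one common way to avoid the length-one special case of sample().

    ```r
    # Sketch: how sample() treats a length-one numeric vector.
    x <- 98
    n <- 1

    set.seed(1)  # arbitrary seed, only so the run is reproducible

    # A length-one numeric x >= 1 is interpreted as 1:x,
    # so these draws come from 1..98, not from the single value 98.
    draws_pitfall <- replicate(2000, sample(x, size = n, replace = TRUE))
    range(draws_pitfall)

    # Resampling by index avoids the special case: the indices are drawn
    # from 1:length(x), so the only value that can come back is x itself.
    draws_index <- replicate(
      2000,
      x[sample.int(length(x), size = n, replace = TRUE)]
    )
    unique(draws_index)
    ```

    The first set of draws spreads over many values between 1 and 98, which is where the interval in the question comes from. The second set contains only 98.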

    Now conceptually, in order to bootstrap a statistic of interest we require a sample with different values. If you only have one observation, every resample contains only that same number, so any statistic calculated on the resamples is constant and says nothing about the uncertainty of the estimate.

    Think of it like an urn problem: Let's say you have an urn with 10 red and 10 green balls.

    You take one observation (red). If you calculate the mean of this one observation (which does not make sense for a sample of one), you get "red". Resampling your one observation does not help either: every resample is c("red","red",...,"red"), and the mean of each resample is again "red".
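
    The same point can be shown with the numbers from the question. This is a sketch under the assumption that the resampling is done by index, so that sample()'s length-one special case does not interfere. The variable names follow the question.

    ```r
    # Sketch: a correctly implemented bootstrap on a single observation.
    x <- 98
    n <- length(x)
    nboot <- 2000

    resample_dist <- numeric(nboot)
    for (i in seq_len(nboot)) {
      # Draw n indices with replacement; with n = 1 this is always index 1.
      resample_i <- x[sample.int(n, size = n, replace = TRUE)]
      resample_dist[i] <- mean(resample_i)
    }

    # Percentile interval from the bootstrap distribution of the mean.
    ci <- quantile(resample_dist, probs = c(0.025, 0.975))
    sprintf("Bootstrapped: %.3f [%.3f, %.3f]", mean(x), ci[1], ci[2])
    # [1] "Bootstrapped: 98.000 [98.000, 98.000]"
    ```

    Every resample mean equals 98, so the interval collapses to a single point. That zero width reflects the lack of information in one observation, not a precise estimate.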