Search code examples
rstatistics-bootstrap

Custom bootstrap confidence intervals in R


I need to find a way for getting the bootstrap confidence intervals on the estimate I obtain with a custom function. Now, the problem is that I have one big matrix from which I take rows out at random and then calculate the quantity needed.

Here is (hopefully) reproducible example

Generate similar random data:

mat1 <- matrix(rnorm(300, 80, 20), nrow = 100)

Function to calculate the desired quantity (where R is correlation matrix):

IIvar <- function(R) { 
d <- eigen(R)$values  
p <- length(d)  
sum((d-1)^2)/(p*(p-1))}

My function that attempts the solution (where omat is the smaller matrix that consists of some mat1 rows, freq is the number of rows in omat, and numR is the number of replicates):

ciint <- function(omat, mat1, freq, numR) {
II <- IIvar(cor(omat))
n <- dim(mat1)[1]
b <- numeric(numR)
for (i in 1:numR) { b[i] <- IIvar(cor(mat1[sample(c(1:n),freq),]))}
hist(b)
abline(v = II, lty = 5, lwd = 3)
return(b) }

The resultant vector b has all the values obtained from matrices of randomly chosen rows (number determined by freq) from mat1 which can be compared with IIvar from omat (the matrix with rows chosen by population membership).

In mat1 I have i.e. 5 populations (grouping by rows) and I need to calculate IIvar for all of them separetely and to generate confidence intervals for the obtained value.

When I run my ciint function like this

ciint(omat, mat1, 61, 1000)

I get the distribution of values, and the position of the "real" IIvar value, but I have no idea how to generate 95% intervals from this point.


Solution

  • All you need is an interval that contains 95% of the generated b values. You can the highest posterior density from bayesian estimation, it's just that. There are many packages that compute it, for instance, function emp.hpd from TeachingDemos. Add

    require(TeachingDemos)
    

    and change the last line (return(b)) from ciint to

    emp.hpd(b)
    

    (No need to use return().)