I would like to generate 1000 samples of size 25 from a standard normal distribution, calculate the variance of each one, and create a histogram. I have the following:
samples = replicate(1000, rnorm(25,0,1), simplify=FALSE)
hist(sapply(samples, var))
Then I would like to randomly select one of those 1000 samples and take 1000 bootstrap resamples from it, then calculate the variance of each resample and plot a histogram. So far, I have:
sub.sample = sample(samples, 1)
This is where I'm stuck. I know a for loop is needed for the bootstrapping, so I have:
rep.boot2 <- numeric(lengths(sub.sample))
for (i in 1:lengths(sub.sample)) {
index2 <- sample(1:1000, size = 25, replace = TRUE)
a.boot <- sub.sample[index2, ]
rep.boot2[i] <- var(a.boot)[1, 2]
}
but running the above produces an "incorrect number of dimensions" error. Which part is causing the error?
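I can see two problems here. The first is that you are subsetting sub.sample as if it were a matrix, with a row index and an empty column index, but sample(samples, 1) actually returns a list of length 1, which has no dimensions. This is the line that raises the error: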
a.boot <- sub.sample[index2, ]
To fix this, you can change
sub.sample = sample(samples, 1)
to
sub.sample = as.vector(unlist(sample(samples, 1)))
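To see the difference for yourself, here is a minimal sketch. The names demo, one, res and vec are made up for illustration, and demo is just a small stand-in for samples:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# hypothetical small stand-in for `samples`: a list of 3 numeric vectors
demo <- replicate(3, rnorm(5), simplify = FALSE)

one <- sample(demo, 1)
class(one)    # "list": sampling from a list gives back a list
length(one)   # 1

# matrix-style subsetting fails because a list has no dimensions
res <- try(one[1:2, ], silent = TRUE)
inherits(res, "try-error")   # TRUE

# unlist() flattens it into a plain numeric vector that [ ] can index
vec <- unlist(one)
is.numeric(vec)   # TRUE
length(vec)       # 5
```

sample(demo, 1)[[1]] would pull out the same vector without going through unlist().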
The second problem is that you are generating a sample of 25 indexes from between 1 and 1000
index2 <- sample(1:1000, size = 25, replace = TRUE)
but then you use these indexes to extract elements from a vector that only has 25 elements. Any index above 25 returns NA, so a.boot would end up with mostly NA values.
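A tiny illustration of the out-of-range behaviour, using a made-up vector x and index vector idx (neither is from your code):

```r
x <- c(10, 20, 30)       # hypothetical vector of length 3
idx <- c(2, 5, 1000)     # only the first index is within range

x[idx]        # 20 NA NA: out-of-range positions come back as NA
var(x[idx])   # NA, since var() does not drop missing values by default
```

So even once the subsetting itself works, the variances would come out as NA unless the indexes are drawn from 1:25.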
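If I understand what you want to do correctly, then this should work. Note that the loop now runs 1000 times, once per bootstrap replicate, instead of lengths(sub.sample) times: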
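samples = replicate(1000, rnorm(25,0,1), simplify=FALSE)
hist(sapply(samples, var))
# pick one of the 1000 samples and flatten it to a plain numeric vector
sub.sample = as.vector(unlist(sample(samples, 1)))
# preallocate one slot per bootstrap replicate
rep.boot2 = numeric(1000)
for (i in 1:1000) {
  # resample the 25 positions with replacement
  index2 <- sample(1:25, size = 25, replace = TRUE)
  a.boot <- sub.sample[index2]
  # var() of a vector is a single number, so no [1, 2] is needed
  rep.boot2[i] <- var(a.boot)
}
# histogram of the 1000 bootstrap variances
hist(rep.boot2)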
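If you don't specifically need the for loop, the same idea can probably be written more compactly with replicate(). This is only a sketch: boot.vars is a name I made up, and sub.sample is regenerated here with rnorm() just so the snippet runs on its own.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# stand-in for the single sample picked above (assumed: numeric vector of length 25)
sub.sample <- rnorm(25)

# each replicate resamples all 25 values with replacement and takes the variance
boot.vars <- replicate(1000, var(sample(sub.sample, replace = TRUE)))

hist(boot.vars, main = "Bootstrap variances", xlab = "variance")
```

Calling sample() directly on the vector with replace = TRUE does the resampling in one step, so there is no separate index vector to get out of range.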