r, list, statistics-bootstrap

Create a matrix from a list consisting of unequal matrices for individual bootstraps


I am trying to create a single matrix from a list that consists of N matrices with unequal numbers of rows. The reason is that I want to draw R individual bootstrap samples. The example below has 2 companies, one with 10 observations and one with just 5.

Data:

set.seed(7)
Time <- c(10,5)

xv <- matrix(c(rnorm(10,5,2), rnorm(5,20,1), rnorm(10,5,2), rnorm(5,20,1)), ncol=2)
y <- matrix(c(rnorm(10,5,2), rnorm(5,20,1)))
z <- matrix(c(rnorm(10,5,2), rnorm(5,20,1), rnorm(10,5,2), rnorm(5,20,1)), ncol=2)

# create a data frame of the input variables, which helps
# to conduct the row-wise bootstrapping
data <- data.frame(y = y, xv = xv, z = z)
rows <- nrow(data)
cols <- ncol(data)

# create the index to sample from the different panels
# (rows 1-10 get index 1, rows 11-15 get index 2)
cumTime <- c(0, cumsum(Time))
index <- findInterval(seq_len(rows), cumTime, left.open = TRUE)

# draw R individual bootstrap samples
# (replicate() takes the number of replications as `n`, not `R`)
R <- 5
bootList <- replicate(n = R, list(), simplify = FALSE)
bootList <- lapply(bootList, function(b) by(data, INDICES = index, FUN = function(panel) dplyr::sample_n(tbl = panel, size = nrow(panel), replace = TRUE)))

---------- UNLISTING ---------

Currently I do it like this, which gives the wrong result. Example for just one entry of the list:

matrix(unlist(bootList[[1]], recursive = T), ncol = cols)

The desired output is just

bootList[[1]]

as a matrix.

Do you have an idea how to do this, if possible in a reasonably efficient way?

The matrices are then fed into MLE estimations, which are unfortunately slow...


Solution

  • I found a solution for you. From what I gather, you have a data frame containing all observations of all companies, which may have different panel lengths. As a result, you would like a bootstrap sample for each company that has the same size as that company's original panel. You merely have to add a company indicator:

    data$company = c(rep(1, 10), rep(2, 5)) # this could even be a factor.
    L1 = split(data, data$company)
    L2 = lapply(L1, FUN = function(s) s[sample(x = 1:nrow(s), size = nrow(s), replace = TRUE),] ) 
    

    Stop here if you would like to keep separate bootstrap samples per company, e.g. in case you want to estimate each company separately. Otherwise, stack them back into one data frame:

    bootdata = do.call(rbind, L2)
    

    Best wishes,

    Tim
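
The answer above draws a single bootstrap sample and leaves it as a data frame, while the question asks for R samples, each as a matrix. The following is a minimal sketch of how the two steps could be combined, not the answerer's own code. The helper name `boot_matrix`, the column names `y`, `x1`, `x2`, and the simplified toy data are assumptions made for illustration; only base R is used.

```r
set.seed(7)

# Toy panel mirroring the question: company 1 has 10 rows, company 2 has 5.
# (Column names and layout are illustrative assumptions.)
Time <- c(10, 5)
data <- data.frame(
  y  = c(rnorm(10, 5, 2), rnorm(5, 20, 1)),
  x1 = c(rnorm(10, 5, 2), rnorm(5, 20, 1)),
  x2 = c(rnorm(10, 5, 2), rnorm(5, 20, 1))
)
company <- rep(seq_along(Time), times = Time)

# Hypothetical helper: resample rows with replacement within each company,
# stack the pieces in company order, and return a plain numeric matrix.
boot_matrix <- function(df, group) {
  pieces <- lapply(split(df, group), function(s) {
    s[sample.int(nrow(s), size = nrow(s), replace = TRUE), , drop = FALSE]
  })
  out <- as.matrix(do.call(rbind, pieces))
  rownames(out) <- NULL  # drop the row labels left over from resampling
  out
}

R <- 5  # number of bootstrap replications
bootMats <- replicate(R, boot_matrix(data, company), simplify = FALSE)

dim(bootMats[[1]])
```

Each element of `bootMats` keeps the original panel lengths, so the first 10 rows are always resampled from company 1 and the last 5 from company 2. For the `by` object built in the question, `as.matrix(do.call(rbind, bootList[[1]]))` should give the same kind of result, since `do.call(rbind, ...)` stacks the per-company data frames row by row instead of flattening them column by column as `unlist` does.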