Tags: r, for-loop, random, lapply, sample

For Loop over Lapply (or using map2)


Hello, I am trying to loop over a list of data frames and draw a sample of a different size from each one. For example, I want a sample of size 10 from df1, a sample of size 8 from df2, and so on. Building on my previously asked question, I worked with Melissa Key to develop the following code:

sampler <- function(df, n,...) {
  return(df[sample(x=nrow(df),n),])
}

#for(i in 1:totalstratum){
sample_list<-lapply(population_list,sampler,n=stratum_sizes[i,1])
#}

#or
library(purrr)
sample_list<-map2(population_list, stratum_sizes,sampler)

where stratum_sizes is a vector (4, 5, 3, 2, 10, 10, 8) and totalstratum = nrow(stratum_sizes), which also equals the number of elements in the list population_list.

So far, I am able to get a sample, but never with the correct number of observations. Any ideas? Thank you in advance for any help!


Solution

  • I assume you'd like to sample a certain number of rows from data.frames stored in a list.

    How about the following, using purrr::map2:

    library(purrr)  # provides map2()

    # Generate sample data: a list of three data.frames
    set.seed(2017)
    lst <- lapply(1:3, function(x) data.frame(val1 = runif(20), val2 = runif(20)))
    
    # Define the sample sizes for every data.frame in the list
    ssize <- c(10, 5, 3)
    
    # Sample rows from every data.frame in the list:
    # .x is the sample size, .y is the matching data.frame
    map2(ssize, lst, ~ .y[sample(nrow(.y), .x), ])
    #[[1]]
    #         val1      val2
    #16 0.38868193 0.6500038
    #8  0.43490560 0.3191046
    #11 0.67433148 0.8838444
    #7  0.03932234 0.6204450
    #2  0.53717641 0.3798674
    #3  0.46919565 0.9420740
    #19 0.94099988 0.1771317
    #5  0.77008816 0.2276118
    #10 0.27383312 0.2608393
    #14 0.43207779 0.2117630
    #
    #[[2]]
    #        val1      val2
    #12 0.8835366 0.6904628
    #4  0.0791699 0.7512366
    #6  0.5096950 0.4699963
    #19 0.5393251 0.4123170
    #20 0.9229542 0.9327490
    #
    #[[3]]
    #        val1      val2
    #4  0.9204118 0.1926415
    #15 0.8373573 0.9309950
    #8  0.1653395 0.5895154
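
The same pattern can be applied to the objects named in the question. The sketch below rests on an assumption: because the question calls nrow(stratum_sizes) and indexes stratum_sizes[i,1], stratum_sizes is taken to be a one-column data frame rather than a plain vector. If so, map2 would iterate over its columns, not its rows, which may be why the sample sizes came out wrong. The population_list contents, the column name n, and the seed are made up for illustration; only sampler, population_list, stratum_sizes and sample_list come from the question.

```r
library(purrr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical stand-ins for the question's objects (assumed shapes)
stratum_sizes <- data.frame(n = c(4, 5, 3, 2, 10, 10, 8))
population_list <- lapply(
  seq_len(nrow(stratum_sizes)),
  function(i) data.frame(id = 1:20, value = runif(20))
)

# Sampler from the question; drop = FALSE keeps a data.frame
# even if a data.frame has a single column
sampler <- function(df, n, ...) {
  df[sample(nrow(df), n), , drop = FALSE]
}

# Pull the sizes out as a plain vector so map2 pairs
# one size with one data.frame
sizes <- stratum_sizes[[1]]
stopifnot(length(sizes) == length(population_list))

# purrr version: calls sampler(population_list[[i]], sizes[[i]])
sample_list <- map2(population_list, sizes, sampler)

# Base R equivalent without purrr
sample_list_base <- Map(sampler, population_list, sizes)

# Check the number of rows drawn from each data.frame
sapply(sample_list, nrow)
```

The final line should return one row count per data frame, matching sizes element by element. Note that wrapping lapply in a for loop, as in the first attempt in the question, would apply the same n to every data frame on each pass and overwrite sample_list each time, so pairing sizes and data frames with map2 or Map is the more direct route.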