Tags: r, list, matrix, data-manipulation, data-cleaning

Sample rows of a list of matrices based on numbers from another list


I have a list of matrices from which I want to randomly draw rows, where the number of rows to draw from each matrix is given by another list. Here is the list of matrices:

x <- list(`1` = matrix(1:20, nrow=10), `2` = matrix(1:20, nrow=10))

Here is the list of numbers:

y <- list(`1` = 2, `2` = 3) # draw 2 rows from `1` and 3 rows from `2`

The final list should look something like this (the rows drawn will vary because the sampling is random):

$`1`
      [,1] [,2]
 [1,]    1   11
 [2,]    6   16

$`2`
      [,1] [,2]
 [1,]    1   11
 [2,]    7   17
 [3,]    9   19

How can I achieve this in base R? Thanks for any help!


Solution

  • We can use Map in base R: it loops over the corresponding list elements of 'x' and 'y' and samples rows from each matrix in 'x' according to the value in 'y'

    # drop = FALSE keeps the result a matrix even when only one row is requested
    Map(function(u, v) u[sample(seq_len(nrow(u)), v), , drop = FALSE], x, y)
    $`1`
         [,1] [,2]
    [1,]    9   19
    [2,]    6   16
    
    $`2`
         [,1] [,2]
    [1,]    3   13
    [2,]    8   18
    [3,]    5   15
    

    Or use map2 from purrr:

    library(purrr)
    map2(x, y, ~ .x[sample(seq_len(nrow(.x)), .y), , drop = FALSE])
    

    If we convert each matrix to a tibble, slice_sample from dplyr can be used as well (the result is then a list of tibbles rather than matrices):

    library(dplyr)
    library(tibble)
    map2(x, y,  ~ .x %>% 
       as.data.frame %>%
       as_tibble %>% 
       slice_sample(n = .y))
    $`1`
    # A tibble: 2 × 2
         V1    V2
      <int> <int>
    1     4    14
    2     7    17
    
    $`2`
    # A tibble: 3 × 2
         V1    V2
      <int> <int>
    1     8    18
    2     6    16
    3     9    19
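
Because the rows are drawn at random, the outputs shown above will differ from run to run. The following is a minimal base R sketch of the same Map idea, wrapped in a small helper so the draw can be checked and reproduced. The helper name `sample_rows`, the seed value, and the input check are illustrative assumptions and not part of the original answer.

```r
# Hypothetical helper (not from the original answer): draw n random rows of matrix m.
sample_rows <- function(m, n) {
  # Fail early if more rows are requested than the matrix has
  stopifnot(is.matrix(m), n >= 0, n <= nrow(m))
  # drop = FALSE keeps a matrix even when n == 1
  m[sample(seq_len(nrow(m)), n), , drop = FALSE]
}

x <- list(`1` = matrix(1:20, nrow = 10), `2` = matrix(1:20, nrow = 10))
y <- list(`1` = 2, `2` = 3)

set.seed(42)  # arbitrary seed, used only so that repeated runs give the same draw

# Index y by names(x) so matrices and counts are paired by name, not by position
res <- Map(sample_rows, x, y[names(x)])

sapply(res, nrow)  # number of rows drawn per element: 2 for `1`, 3 for `2`
```

Pairing by name is a design choice that matters only if 'x' and 'y' might be stored in a different order; when both lists are already aligned, `Map(sample_rows, x, y)` behaves the same way.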