
Class of output object differs as input data differs


I am trying to draw a variable number of samples for each of n attempts. In this example n = 8 because length(n.obs) == 8. Once all of the samples have been drawn, I want to combine them into a matrix with one row per attempt.

Here is my first attempt:

set.seed(1234)
n.obs <- c(2,1,2,2,2,2,2,2)
my.samples <- sapply(1:8, function(x) sample(1:4, size=n.obs[x], prob=c(0.1,0.2,0.3,0.4), replace=TRUE))
my.samples

This approach produces a list.

class(my.samples)
#[1] "list"

I identify the number of columns needed in the output matrix using:

max.len <- max(sapply(my.samples, length))
max.len
#[1] 2

The output matrix can be created using:

 corrected.list <- lapply(my.samples, function(x) {c(x, rep(NA, max.len - length(x)))})
 output.matrix <- do.call(rbind, corrected.list)
 output.matrix[is.na(output.matrix)] <- 0
 output.matrix
 #     [,1] [,2]
 #[1,]    4    3 
 #[2,]    3    0
 #[3,]    3    2
 #[4,]    3    4
 #[5,]    4    3
 #[6,]    3    3
 #[7,]    3    4
 #[8,]    1    4
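The padding step can also be checked on fixed data, where the result does not depend on the random seed. The sketch below is only an illustration: `fixed.samples` is a hypothetical stand-in for `my.samples`, and it swaps the `rep(NA, ...)` padding for the base R replacement function `length<-`, which extends a vector with NA.

```r
# Hypothetical fixed input standing in for my.samples, so the output is reproducible.
fixed.samples <- list(c(4L, 3L), 3L, c(3L, 2L))
max.len <- max(lengths(fixed.samples))

# `length<-` extends each vector to max.len, filling the new slots with NA.
padded <- lapply(fixed.samples, `length<-`, max.len)

# Stack the equal-length vectors as rows, then turn the NA padding into 0.
output.matrix <- do.call(rbind, padded)
output.matrix[is.na(output.matrix)] <- 0
output.matrix
```

Here the second row comes out as 3, 0, matching the way the short sample is handled above.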

The above approach seems to work fine as long as n.obs contains more than one distinct value and at least one element of n.obs is greater than 1. However, I want the code to be flexible enough to handle each of the following n.obs:

With the following n.obs, the sapply statement above returns a 2 x 8 matrix.

set.seed(1234)
n.obs <- c(2,2,2,2,2,2,2,2)

With the following n.obs, the sapply statement above returns an integer vector of length 8.

set.seed(3333)
n.obs <- c(1,1,1,1,1,1,1,1)

With the following n.obs, the sapply statement above returns a list of eight empty vectors.

n.obs <- c(0,0,0,0,0,0,0,0)
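The three return types follow from how sapply simplifies its result: it builds a matrix when every result has the same length greater than one, a plain vector when every result has length one, and otherwise returns the list unchanged. A minimal sketch of that behaviour, using seq_len in place of sample (an assumption made here only so that the results are deterministic):

```r
# seq_len(k) returns a vector of length k, standing in for sample(..., size = k).
all.two  <- sapply(c(2, 2, 2), seq_len)   # equal lengths greater than 1
all.one  <- sapply(c(1, 1, 1), seq_len)   # every result has length 1
all.zero <- sapply(c(0, 0, 0), seq_len)   # every result has length 0
mixed    <- sapply(c(2, 1, 2), seq_len)   # unequal lengths

is.matrix(all.two)     # simplified to a matrix with one column per input
is.integer(all.one)    # simplified to a plain integer vector
is.list(all.zero)      # not simplified, stays a list
is.list(mixed)         # not simplified, stays a list
```

Using lapply instead of sapply avoids the simplification step, so the result is a list in all four cases.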

Here are examples of the desired output for each of the three n.obs above:

desired.output <- matrix(c(4, 3,
                           3, 3,
                           2, 3,
                           4, 4,
                           3, 3,
                           3, 3,
                           4, 1,
                           4, 2), ncol = 2, byrow = TRUE)

desired.output <- matrix(c(2,
                           3,
                           4,
                           2,
                           3,
                           4,
                           4,
                           1), ncol = 1, byrow = TRUE)

desired.output <- matrix(c(0,
                           0,
                           0,
                           0,
                           0,
                           0,
                           0,
                           0), ncol = 1, byrow = TRUE)

How can I generalize the code so that it always returns a matrix with eight rows regardless of the n.obs used as input? One way would be to use a series of if statements to handle problematic cases, but I thought there might be a simpler and more efficient solution.


Solution

  • We can write a function:

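    get_matrix <- function(n.obs) {
    
       # One output row per element of n.obs
       nr <- length(n.obs)
       # Draw n.obs[i] values for each i; sapply may return a list, a vector, or a matrix
       my.samples <- sapply(n.obs, function(x) 
                      sample(1:4, size=x, prob=c(0.1,0.2,0.3,0.4), replace=TRUE))
       # Longest element (1 when sapply has already simplified to a vector or matrix)
       max.len <- max(lengths(my.samples))
       # Indexing past the end of an element yields NA, which pads the short samples
       mat <- matrix(c(sapply(my.samples, `[`, 1:max.len)), nrow = nr, byrow = TRUE)
       # Replace the NA padding with 0
       mat[is.na(mat)] <- 0
       mat
    }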
    

    Checking the output (no seed is set, so the sampled values will differ between runs):

    get_matrix(c(2,1,2,2,2,2,2,2))
    
    #     [,1] [,2]
    #[1,]    1    4
    #[2,]    4    0
    #[3,]    4    3
    #[4,]    4    4
    #[5,]    4    2
    #[6,]    4    3
    #[7,]    4    4
    #[8,]    4    4
    
    get_matrix(c(1,1,1,1,1,1,1,1))
    
    #     [,1]
    #[1,]    4
    #[2,]    4
    #[3,]    3
    #[4,]    4
    #[5,]    2
    #[6,]    4
    #[7,]    1
    #[8,]    4
    
    get_matrix(c(0,0,0,0,0,0,0,0))
    #     [,1]
    #[1,]    0
    #[2,]    0
    #[3,]    0
    #[4,]    0
    #[5,]    0
    #[6,]    0
    #[7,]    0
    #[8,]    0
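A variant of the same idea that does not depend on how sapply simplifies its result is sketched below. It is not part of the answer above: the name `get_matrix_safe` and the `values` and `prob` arguments are hypothetical, chosen here for illustration. It uses lapply so the samples are always a list, and vapply so every padded sample has a known length.

```r
# Hypothetical helper (not from the answer above): pads with 0 directly and
# keeps the intermediate result a list regardless of n.obs.
get_matrix_safe <- function(n.obs, values = 1:4, prob = c(0.1, 0.2, 0.3, 0.4)) {
  # lapply() always returns a list, one sample vector per element of n.obs.
  my.samples <- lapply(n.obs, function(k)
    sample(values, size = k, prob = prob, replace = TRUE))
  # Use at least one column, so an all-zero n.obs gives a single column of 0.
  max.len <- max(1L, lengths(my.samples))
  # Pad each sample with zeros; vapply() checks that each result has length max.len.
  padded <- vapply(my.samples,
                   function(s) c(s, rep(0, max.len - length(s))),
                   numeric(max.len))
  # padded is stored sample by sample, so filling by row puts one sample per row.
  matrix(padded, nrow = length(n.obs), byrow = TRUE)
}

dim(get_matrix_safe(c(2, 1, 2, 2, 2, 2, 2, 2)))   # 8 rows, 2 columns
dim(get_matrix_safe(rep(1, 8)))                   # 8 rows, 1 column
dim(get_matrix_safe(rep(0, 8)))                   # 8 rows, 1 column
```

As in get_matrix, the sampled values change from run to run unless set.seed is called first; only the shape and the zero padding are fixed.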