I am trying to draw a variable number of samples for each of n attempts. In this example n = 8 because length(n.obs) == 8. Once all of the samples have been drawn I want to combine them into a matrix.
Here is my first attempt:
set.seed(1234)
n.obs <- c(2,1,2,2,2,2,2,2)
my.samples <- sapply(1:8, function(x) sample(1:4, size=n.obs[x], prob=c(0.1,0.2,0.3,0.4), replace=TRUE))
my.samples
This approach produces a list.
class(my.samples)
#[1] "list"
I identify the number of columns needed in the output matrix using:
max.len <- max(sapply(my.samples, length))
max.len
#[1] 2
The output matrix can be created using:
corrected.list <- lapply(my.samples, function(x) {c(x, rep(NA, max.len - length(x)))})
output.matrix <- do.call(rbind, corrected.list)
output.matrix[is.na(output.matrix)] <- 0
output.matrix
# [,1] [,2]
#[1,] 4 3
#[2,] 3 0
#[3,] 3 2
#[4,] 3 4
#[5,] 4 3
#[6,] 3 3
#[7,] 3 4
#[8,] 1 4
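As a side note, the NA-padding step can also be written with the replacement function `length<-`, which extends a vector with NA when asked for a longer length. The sketch below uses a small hand-made list (`ragged`, illustrative data, not drawn with sample) so the result is easy to check by eye.

```r
# Illustrative ragged list (hypothetical values, not output of sample())
ragged <- list(c(4L, 3L), 3L, c(3L, 2L))

# Longest element determines the number of columns
max.len <- max(lengths(ragged))

# `length<-`(x, max.len) pads shorter vectors with NA
padded <- lapply(ragged, `length<-`, max.len)

# Stack the padded vectors as rows, then turn the NA padding into 0
out <- do.call(rbind, padded)
out[is.na(out)] <- 0L
out
```

Here the second row of `out` is 3 followed by a padded 0, matching the behaviour of the rep(NA, ...) version.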
The above approach seems to work fine as long as n.obs includes multiple values and at least one element in n.obs > 1. However, I want the code to be flexible enough to handle each of the following n.obs:
The above sapply statement returns a 2 x 8 matrix with the following n.obs.
set.seed(1234)
n.obs <- c(2,2,2,2,2,2,2,2)
The above sapply statement returns an integer vector with the following n.obs.
set.seed(3333)
n.obs <- c(1,1,1,1,1,1,1,1)
The above sapply statement returns a list with the following n.obs.
n.obs <- c(0,0,0,0,0,0,0,0)
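The three return types come from sapply simplifying its result whenever every draw has the same non-zero length. A quick way to see this is to wrap the sapply call in a helper (named `draw` here purely for illustration) and inspect what comes back for each n.obs:

```r
# Hypothetical helper wrapping the sapply call from the question
draw <- function(n.obs) {
  sapply(seq_along(n.obs), function(i)
    sample(1:4, size = n.obs[i], prob = c(0.1, 0.2, 0.3, 0.4), replace = TRUE))
}

set.seed(1234)
all.two   <- draw(rep(2, 8))                 # equal lengths > 1: simplified to a matrix
all.one   <- draw(rep(1, 8))                 # all length 1: simplified to a vector
all.zero  <- draw(rep(0, 8))                 # all length 0: left as a list
mixed.len <- draw(c(2, 1, 2, 2, 2, 2, 2, 2)) # unequal lengths: left as a list

is.matrix(all.two)    # TRUE
dim(all.two)          # 2 8
is.integer(all.one)   # TRUE
is.list(all.zero)     # TRUE
is.list(mixed.len)    # TRUE
```

Because the shape of the result depends on the data, any code downstream of sapply has to cope with all three cases; lapply, which always returns a list, avoids that.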
Here are example desired results for each of the above three n.obs:
desired.output <- matrix(c(4, 3,
3, 3,
2, 3,
4, 4,
3, 3,
3, 3,
4, 1,
4, 2), ncol = 2, byrow = TRUE)
desired.output <- matrix(c(2,
3,
4,
2,
3,
4,
4,
1), ncol = 1, byrow = TRUE)
desired.output <- matrix(c(0,
0,
0,
0,
0,
0,
0,
0), ncol = 1, byrow = TRUE)
How can I generalize the code so that it always returns a matrix with eight rows regardless of the n.obs used as input? One way would be to use a series of if statements to handle problematic cases, but I thought there might be a simpler and more efficient solution.
We can write a function:
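get_matrix <- function(n.obs) {
  nr <- length(n.obs)  # one output row per element of n.obs
  # Draw n.obs[i] values per attempt; sapply may return a list, vector, or matrix
  my.samples <- sapply(n.obs, function(x)
    sample(1:4, size=x, prob=c(0.1,0.2,0.3,0.4), replace=TRUE))
  max.len <- max(lengths(my.samples))
  # Index each element by 1:max.len (out-of-range positions become NA),
  # then fill the matrix row by row
  mat <- matrix(c(sapply(my.samples, `[`, 1:max.len)), nrow = nr, byrow = TRUE)
  mat[is.na(mat)] <- 0  # replace the NA padding with 0
  mat
}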
Checking the output:
get_matrix(c(2,1,2,2,2,2,2,2))
# [,1] [,2]
#[1,] 1 4
#[2,] 4 0
#[3,] 4 3
#[4,] 4 4
#[5,] 4 2
#[6,] 4 3
#[7,] 4 4
#[8,] 4 4
get_matrix(c(1,1,1,1,1,1,1,1))
# [,1]
#[1,] 4
#[2,] 4
#[3,] 3
#[4,] 4
#[5,] 2
#[6,] 4
#[7,] 1
#[8,] 4
get_matrix(c(0,0,0,0,0,0,0,0))
# [,1]
#[1,] 0
#[2,] 0
#[3,] 0
#[4,] 0
#[5,] 0
#[6,] 0
#[7,] 0
#[8,] 0
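A possible variation, offered as a sketch rather than a replacement for the function above: drawing with lapply keeps the samples in a list for every n.obs, and vapply with an explicit template makes the padded result type-stable. The name `get_matrix2` and the choice to force at least one column (so the all-zero case yields an 8 x 1 matrix of zeros) are assumptions made for this example.

```r
# Hypothetical variant: always a list from lapply, explicit zero padding
get_matrix2 <- function(n.obs) {
  # lapply never simplifies, so samples is a list whatever the sizes are
  samples <- lapply(n.obs, function(k)
    sample(1:4, size = k, prob = c(0.1, 0.2, 0.3, 0.4), replace = TRUE))

  # Use at least one column so that all-zero sizes still give a column of 0s
  n.col <- max(lengths(samples), 1L)

  # Pad every draw with integer zeros up to n.col entries
  padded <- vapply(samples,
                   function(s) c(s, rep(0L, n.col - length(s))),
                   integer(n.col))

  # padded holds one draw per column (or is a plain vector when n.col is 1);
  # filling by row puts each draw on its own row
  matrix(as.vector(padded), nrow = length(n.obs), byrow = TRUE)
}

set.seed(1234)
dim(get_matrix2(c(2, 1, 2, 2, 2, 2, 2, 2)))  # 8 2
dim(get_matrix2(rep(1, 8)))                  # 8 1
dim(get_matrix2(rep(0, 8)))                  # 8 1
```

The drawn values depend on the seed, but the dimensions do not: each call returns a matrix with length(n.obs) rows.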