Can anybody help me optimize how I sample one random value from each row of a matrix?
Thus far I have two ways of doing it:
library(microbenchmark)

volume <- matrix(rnorm(10000), 100, 100)
microbenchmark(
  # sample one value per row with apply
  test <- apply(volume, 1, function(x) sample(x, 1)),
  # flatten the matrix row by row into a vector, sample a column position for all rows at once,
  # and then add the row offsets 0, 100, 200, ... to those positions
  test2 <- c(t(volume))[sample.int(ncol(volume), nrow(volume), replace = TRUE) + seq(from = 0, by = ncol(volume), length.out = nrow(volume))]
)
You can index a matrix with a two-column matrix: the first column selects the row and the second column selects the column.
For example:
index <- matrix(c(1, 4, 1, 3), ncol = 2)
index
     [,1] [,2]
[1,]    1    1
[2,]    4    3
matrix(c(1:15), ncol = 3)
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   13
[4,]    4    9   14
[5,]    5   10   15
matrix(c(1:15), ncol = 3)[index]
[1] 1 14
Applying the same principle, we can build an index matrix whose first column holds the row numbers (1:100) and whose second column holds a random integer from 1 to 100 for each row, which selects a random column in that row.
test3 <- volume[matrix(c(1:100, sample.int(100, 100, replace = TRUE)), nrow = 100)]
It seems to be a little faster, probably because the indexing is vectorised, but I don't know R's internals well enough to say for sure. I hope this helps or gives you other ideas!
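For a matrix of any size, the same matrix-indexing idea can be wrapped in a small function. This is only a sketch: the name sample_per_row is made up for illustration and is not part of the original post, and it assumes the matrix has at least one column. The check at the end confirms that each sampled value really comes from its own row.

```r
# Hypothetical helper (name chosen for this example only): draw one random
# element from each row of a matrix `m` using a two-column index matrix.
sample_per_row <- function(m) {
  rows <- seq_len(nrow(m))
  # One random column position per row, drawn with replacement
  cols <- sample.int(ncol(m), nrow(m), replace = TRUE)
  # Each row of the index matrix is a (row, column) pair
  m[cbind(rows, cols)]
}

set.seed(1)  # only to make this example reproducible
volume <- matrix(rnorm(10000), 100, 100)
picked <- sample_per_row(volume)

# Sanity check: every sampled value must be present in its own row
in_own_row <- vapply(seq_along(picked),
                     function(i) picked[i] %in% volume[i, ],
                     logical(1))
stopifnot(length(picked) == nrow(volume), all(in_own_row))
```

Using seq_len(nrow(m)) and ncol(m) instead of the hard-coded 100 keeps the function independent of the matrix dimensions; timings would still need to be checked with microbenchmark on your own data.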