I have a $N$x$N$ matrix $G$ composed of zeros and ones. I would like to resample only the ones in $G$ from a Bernoulli distribution where probability $p$ varies along the raws. The idea is to replace all ones in $G$ with realizations from independent Bernoulli variables. I have tried this in R :
for(i in 1:N){
for(j in 1:N){
G[i,j] <- ifelse(G[i,j] == 1,rbinom(1,1,p[i]),0)}}
However, for computation time, it would be better to avoid the use of for loops. Would anyone know how to write this in matrix form ? Thank you very much!
Exploit R
's recycling of vectors and its column-oriented storage. I'll explain after showing one solution.
#
# Generate a 0/1 matrix `G` for testing.
#
N <- 1e3
G <- matrix(rbinom(N^2, 1, 1/2), N)
#
# Generate the vector of probabilities for testing.
#
p <- runif(N)
#
# Here is a solution.
#
U <- runif(N*N) <= p # Random 0/1 values
G.0 <- G * U # Conditionally replace elements of G
The runif
function generates one uniform variate for each entry in G
. Comparing it to the vector p
creates N vectors of 0/1 values (1 for true, 0 for false), all strung together, with the expected frequency of ones in locations i, i+N, i+2N, ..., i+(N-1)N given by the value of p[i]
for i=1, 2, ..., N. When this is multiplied by G
in the last line, the new values are zeros when either the old value was zero or the new one is, which is the desired result because the values of G
internally are also laid out column by column.
This solution is about 20 times faster than the original double for loop. It avoids the slowness of conditional execution (ifelse
) and exploits any internal parallelism and optimization of R
's basic operations (runif
, <=
, and *
). You do pay a (relatively small) price: RAM to store these random results (here explicitly placed into array U
) is needed.