I have the following code, which uses a nested loop. It runs on random sample data with sample values of N, Taumax, and Tmax that are smallish relative to the actual application.
N <- 10000
Taumax <- 50
Tmax <- 100
set.seed(42)
X1 <- matrix(rnorm(N * (Tmax+1)), N)
X2 <- matrix(rnorm(N * (Tmax+1)), N)
X3 <- matrix(rnorm(N * (Tmax+1)), N)
Phi <- matrix(rnorm(Taumax * (Tmax+1)), Taumax)
Psi <- matrix(rnorm(Taumax * 3), Taumax)
y <- array(0.0, dim = c(N, Taumax, (Tmax + 1)))
for (t in 1:(Tmax + 1)) {
  for (tau in 1:Taumax) {
    y[, tau, t] <-
      exp(-(
        Phi[tau, t] + Psi[tau, 1] * X1[, t] +
          Psi[tau, 2] * X2[, t] +
          Psi[tau, 3] * X3[, t]
      ) / tau) - 1
  }
}
I'm curious whether the loop (i.e., the last part of the code) can be sped up significantly using, e.g., apply, mapply, or some other form of vectorization (or matrix algebra, for that matter). I have used apply before, but I'm far from proficient with it.
One thing that could spell trouble is that one of the loop indices (tau) also enters the calculation itself, as a divisor.
(X1, X2, and X3 could also be combined into a single array X. Would that speed things up as well?)
You could try replacing the inner loop with the outer() function, which may provide a speedup.
That said, in current versions of R, for loops are not significantly slower than the apply functions, so there may not be much benefit.
N <- 10000
Taumax <- 50
Tmax <- 100
set.seed(42)
X1 <- matrix(rnorm(N * (Tmax+1)), N)
X2 <- matrix(rnorm(N * (Tmax+1)), N)
X3 <- matrix(rnorm(N * (Tmax+1)), N)
Phi <- matrix(rnorm(Taumax * (Tmax+1)), Taumax)
Psi <- matrix(rnorm(Taumax * 3), Taumax)
y <- array(0.0, dim = c(N, Taumax, (Tmax + 1)))
for (t in 1:(Tmax + 1)) {
  for (tau in 1:Taumax) {
    y[, tau, t] <-
      exp(-(
        Phi[tau, t] + Psi[tau, 1] * X1[, t] +
          Psi[tau, 2] * X2[, t] +
          Psi[tau, 3] * X3[, t]
      ) / tau) - 1
  }
}
# Replace the inner loop with the outer product operator, %o%.
# Each %o% term is a Taumax x N matrix; Phi[, t] and the divisor are recycled down the rows.
z <- array(0.0, dim = c(N, Taumax, (Tmax + 1)))
for (t in 1:(Tmax + 1)) {
  z[, , t] <- t(exp(-(Phi[, t] +
                        Psi[, 1] %o% X1[, t] +
                        Psi[, 2] %o% X2[, t] +
                        Psi[, 3] %o% X3[, t]) / seq_len(Taumax)) - 1)
}
# Compare results
identical(y, z)
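The question also asks whether stacking X1, X2, and X3 into one array X would help, and whether matrix algebra applies. The following is only a sketch of one way that could look, not a benchmarked recommendation: it assumes X is laid out as N x 3 x (Tmax + 1), uses small hypothetical dimensions so it runs quickly, and replaces the three products with a single tcrossprod() per t. Because the matrix product sums the terms in a different order than the loop, the check uses all.equal() rather than identical().

```r
# Small, hypothetical dimensions so the sketch runs quickly
N <- 200; Taumax <- 5; Tmax <- 7
set.seed(42)

# Assumed layout: X stacks X1, X2, X3 along the second dimension
X   <- array(rnorm(N * 3 * (Tmax + 1)), dim = c(N, 3, Tmax + 1))
Phi <- matrix(rnorm(Taumax * (Tmax + 1)), Taumax)
Psi <- matrix(rnorm(Taumax * 3), Taumax)

# Reference: the original double loop, reading X1..X3 from slices of X
y <- array(0, dim = c(N, Taumax, Tmax + 1))
for (t in 1:(Tmax + 1)) {
  for (tau in 1:Taumax) {
    y[, tau, t] <- exp(-(Phi[tau, t] + Psi[tau, 1] * X[, 1, t] +
                           Psi[tau, 2] * X[, 2, t] +
                           Psi[tau, 3] * X[, 3, t]) / tau) - 1
  }
}

# Matrix-algebra version: one matrix product per t
tau_div <- rep(seq_len(Taumax), each = N)  # column j of an N x Taumax matrix is divided by j
w <- array(0, dim = c(N, Taumax, Tmax + 1))
for (t in 1:(Tmax + 1)) {
  lin <- tcrossprod(X[, , t], Psi)         # N x Taumax: X[, , t] %*% t(Psi)
  lin <- lin + rep(Phi[, t], each = N)     # add Phi[tau, t] to column tau
  w[, , t] <- exp(-lin / tau_div) - 1
}

# Same values up to floating-point rounding
ok <- isTRUE(all.equal(y, w))
```

Whether this is faster than the outer() version at the full problem size would need to be measured; most of the time is likely spent in exp() on the N x Taumax x (Tmax + 1) values either way.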
Update: using the microbenchmark package, the improvement turns out to be modest (about 14% faster at the median):
Unit: seconds
  expr      min       lq     mean   median       uq      max neval
 Org() 2.349934 2.416134 2.462019 2.437054 2.515235 2.641641   100
 new() 1.988689 2.053746 2.093798 2.094842 2.127008 2.250958   100
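For anyone who wants to reproduce a comparison like this, here is a rough sketch of how the timing could be set up. The names Org() and new() come from the table above; their bodies are assumed to be the two snippets from this answer wrapped in functions. The dimensions are hypothetical and much smaller than in the table, so the numbers will differ. The benchmark call is skipped if the microbenchmark package is not installed.

```r
# Small, hypothetical dimensions; the table above used N = 10000, Taumax = 50, Tmax = 100
N <- 500; Taumax <- 10; Tmax <- 20
set.seed(42)
X1  <- matrix(rnorm(N * (Tmax + 1)), N)
X2  <- matrix(rnorm(N * (Tmax + 1)), N)
X3  <- matrix(rnorm(N * (Tmax + 1)), N)
Phi <- matrix(rnorm(Taumax * (Tmax + 1)), Taumax)
Psi <- matrix(rnorm(Taumax * 3), Taumax)

# Original nested loop, wrapped so it can be timed
Org <- function() {
  y <- array(0, dim = c(N, Taumax, Tmax + 1))
  for (t in 1:(Tmax + 1)) {
    for (tau in 1:Taumax) {
      y[, tau, t] <- exp(-(Phi[tau, t] + Psi[tau, 1] * X1[, t] +
                             Psi[tau, 2] * X2[, t] +
                             Psi[tau, 3] * X3[, t]) / tau) - 1
    }
  }
  y
}

# Inner loop replaced by outer products
new <- function() {
  z <- array(0, dim = c(N, Taumax, Tmax + 1))
  for (t in 1:(Tmax + 1)) {
    z[, , t] <- t(exp(-(Phi[, t] + Psi[, 1] %o% X1[, t] +
                          Psi[, 2] %o% X2[, t] +
                          Psi[, 3] %o% X3[, t]) / seq_len(Taumax)) - 1)
  }
  z
}

# Confirm both versions agree before timing them
same <- isTRUE(all.equal(Org(), new()))

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(Org(), new(), times = 10))
}
```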
If that is not enough, you could try rewriting the loop using Rcpp.
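As a starting point, an Rcpp version might look something like the sketch below. This is untimed and only illustrative: fill_y is a hypothetical name, the dimensions are small placeholders, and it assumes the Rcpp package and a working C++ toolchain are available. It fills the output in the same column-major order that R uses for an N x Taumax x (Tmax + 1) array.

```r
library(Rcpp)

# Hypothetical kernel: computes the whole array in compiled code
cppFunction('
NumericVector fill_y(NumericMatrix X1, NumericMatrix X2, NumericMatrix X3,
                     NumericMatrix Phi, NumericMatrix Psi) {
  int N = X1.nrow(), nT = X1.ncol(), Taumax = Phi.nrow();
  NumericVector y(Dimension(N, Taumax, nT));
  R_xlen_t k = 0;  // running index; R arrays are stored column-major
  for (int t = 0; t < nT; ++t) {
    for (int tau = 0; tau < Taumax; ++tau) {
      double phi = Phi(tau, t);
      double p1 = Psi(tau, 0), p2 = Psi(tau, 1), p3 = Psi(tau, 2);
      double d = tau + 1.0;  // tau is 1-based in the R loop
      for (int n = 0; n < N; ++n, ++k) {
        y[k] = std::exp(-(phi + p1 * X1(n, t) + p2 * X2(n, t) + p3 * X3(n, t)) / d) - 1.0;
      }
    }
  }
  return y;
}')

# Small, hypothetical dimensions for a quick correctness check
N <- 100; Taumax <- 4; Tmax <- 5
set.seed(42)
X1  <- matrix(rnorm(N * (Tmax + 1)), N)
X2  <- matrix(rnorm(N * (Tmax + 1)), N)
X3  <- matrix(rnorm(N * (Tmax + 1)), N)
Phi <- matrix(rnorm(Taumax * (Tmax + 1)), Taumax)
Psi <- matrix(rnorm(Taumax * 3), Taumax)

# Reference result from the original R loop
y <- array(0, dim = c(N, Taumax, Tmax + 1))
for (t in 1:(Tmax + 1)) {
  for (tau in 1:Taumax) {
    y[, tau, t] <- exp(-(Phi[tau, t] + Psi[tau, 1] * X1[, t] +
                           Psi[tau, 2] * X2[, t] +
                           Psi[tau, 3] * X3[, t]) / tau) - 1
  }
}

y_cpp  <- fill_y(X1, X2, X3, Phi, Psi)
ok_cpp <- isTRUE(all.equal(y, y_cpp))
```

Since exp() has to be evaluated for every element regardless, the gain over the R versions may be limited; it would need to be benchmarked at the real problem size.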