I have a function which returns a (numVals x N) array/matrix. This function needs to be evaluated K times. My goal is to store all results in a multidimensional array containing doubles with shape c(numVals, N, K).
I'm having trouble finding appropriate arguments for .combine (or other parameters of foreach) so that its return value is in the correct format. I realize I could just reshape the 2D array returned by foreach afterwards, but I am running into memory limitations (and I'm not sure I can reshape in place without any memory overhead).
The solution I'm looking for can be either foreach (or a similar function compatible with %dopar%) outputting a 3D array directly, or a way of reshaping into the correct format without having to create another object with a memory footprint as large as results.
Here's a code snippet:
library(doParallel)
library(doRNG)
registerDoParallel(cores = 3)
registerDoRNG(12345)
run_tasks <- function(k, N, numVals) {
return(matrix(runif(numVals * N), numVals, N))
}
K <- 10000
N <- 40
numVals <- 10
# Run the simulation
results <-
foreach(k = 1:K, .combine = rbind) %dorng% run_tasks(k, N, numVals)
# Desired output format
# results <- array(NA, c(numVals, N, K))
By default, foreach returns the results in a list, so don't use .combine at all and make an array afterwards.
> results <- foreach(k = 1:K) %dorng% run_tasks(k, N, numVals)
> A <- array(unlist(results), dim=c(numVals, N, K))
> dim(A)
[1] 10 40 10000
> all.equal(A[,,1], results[[1]])
[1] TRUE
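The reason unlist works here is that R stores matrices in column-major order, so concatenating K matrices of shape numVals x N gives exactly the memory layout of a numVals x N x K array. Below is a minimal base-R sketch of the same idea on a small stand-in list (built with lapply instead of foreach, purely for illustration). It sets dim on the flattened vector instead of calling array(), which may save one intermediate copy; whether it actually does depends on R's copy semantics, so treat that part as an assumption to verify on your data.

```r
# Small stand-in for the foreach result: a list of K matrices, each numVals x N.
# The values are chosen so that every slice is easy to recognise.
numVals <- 2; N <- 3; K <- 4
results <- lapply(seq_len(K), function(k) {
  matrix(k * 100 + seq_len(numVals * N), numVals, N)
})

# Keep one slice for checking, then flatten the list into a plain vector.
slice3 <- results[[3]]
A <- unlist(results, use.names = FALSE)

# Drop the list so that only the flat vector remains in memory.
rm(results)

# Attach the dimensions directly to the flat vector (column-major layout).
dim(A) <- c(numVals, N, K)

# The third slice of the array is the third matrix of the original list.
stopifnot(identical(A[, , 3], slice3))
```

Note that unlist itself still allocates a vector as large as all the results together, so the list and the flat vector coexist briefly.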
Not a frequent foreach user, but I think .combine is just a convenience argument, and instead of .combine=rbind you could also use do.call('rbind', results). The combining is done after the multithreaded process, so using .combine should not have significant speed gains.
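As a rough illustration of that equivalence, the base-R sketch below stacks a small stand-in list with do.call(rbind, ...) and then converts the stacked matrix into the numVals x N x K layout the question asks for. The stand-in list and the variable names stacked and B are made up for this example. Note that aperm returns a new array, so this route does not avoid the extra memory the question is worried about.

```r
# Stand-in for a list result from foreach: K matrices, each numVals x N.
numVals <- 2; N <- 3; K <- 4
results <- lapply(seq_len(K), function(k) {
  matrix(k * 100 + seq_len(numVals * N), numVals, N)
})

# Same shape as what .combine = rbind would return: (K * numVals) x N.
stacked <- do.call(rbind, results)

# In column-major order the rows of 'stacked' run over (i, k) and the
# columns over j, so the data can be viewed as numVals x K x N ...
B <- array(stacked, dim = c(numVals, K, N))

# ... and the last two dimensions swapped to get numVals x N x K.
A <- aperm(B, c(1, 3, 2))

stopifnot(identical(A[, , 2], results[[2]]))
```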