I have a list of unique objects whose identities matter. Order matters too, but only as a way of tracking identity:
fakeDataList <- list(one = 1,
                     two = 2,
                     three = 3,
                     four = 4)
There's a function that performs pairwise calculations...
fakeInnerFxn <- function(l){
  x <- l[[1]]
  y <- l[[2]]
  low <- x + y - 1
  mid <- x + y
  high <- x + y + 1
  out <- c(low, mid, high)
  return(out)
}
... and returns three values per pair of ids
fakeInnerFxn(fakeDataList[c(1,2)])
#> [1] 2 3 4
The inner function is nested within the outer function which performs each pairwise operation on the full list...
fakeOuterFxn <- function(d){
  n <- length(d)
  out <- array(0, dim = c(n,n,3))
  colnames(out) <- names(d)
  rownames(out) <- names(d)
  for(i in 1:n){
    for(j in (i+1):n){
      if (j <= n) {
        out[i, j, ] <- fakeInnerFxn(d[c(i, j)])
      }
    }
  }
  diag(out[,,1]) <- 0 # not sure how to do this succinctly
  diag(out[,,2]) <- 0
  diag(out[,,3]) <- 0
  return(out)
}
... and returns an array of three matrices holding the 'low', 'mid' and 'high' values
fakeOuterFxn(fakeDataList)
#> , , 1
#>
#> one two three four
#> one 0 2 3 4
#> two 0 0 4 5
#> three 0 0 0 6
#> four 0 0 0 0
#>
#> , , 2
#>
#> one two three four
#> one 0 3 4 5
#> two 0 0 5 6
#> three 0 0 0 7
#> four 0 0 0 0
#>
#> , , 3
#>
#> one two three four
#> one 0 4 5 6
#> two 0 0 6 7
#> three 0 0 0 8
#> four 0 0 0 0
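On the `diag()` resets: they are only needed because the outer loop runs all the way to `n`. When `i` equals `n`, `(i + 1):n` counts down (`5:4` is `5 4`), so the `j <= n` guard lets `j = n` through and the pair (n, n) is computed. Below is a minimal sketch of a variant that stops at `n - 1` instead, assuming the diagonal should simply keep its initial 0. `fakeOuterFxn2` is a hypothetical name, not part of the original code.

```r
# Same inner function as above, written compactly
fakeInnerFxn <- function(l) {
  x <- l[[1]]
  y <- l[[2]]
  c(x + y - 1, x + y, x + y + 1)
}

# Hypothetical variant of fakeOuterFxn: no guard, no diag() resets
fakeOuterFxn2 <- function(d) {
  n <- length(d)
  # Set row and column names in one step via dimnames
  out <- array(0, dim = c(n, n, 3),
               dimnames = list(names(d), names(d), NULL))
  if (n < 2) return(out)          # nothing to pair up
  for (i in seq_len(n - 1)) {     # last row has no partner with j > i
    for (j in (i + 1):n) {        # always counts up because i < n
      out[i, j, ] <- fakeInnerFxn(d[c(i, j)])
    }
  }
  out
}

res2 <- fakeOuterFxn2(list(one = 1, two = 2, three = 3, four = 4))
```

Only the strict upper triangle is ever written, so the diagonal and lower triangle stay at 0 without further cleanup.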
The actual data is a very long list and the calculations are slow.
How can I parallelize this code with foreach and doParallel in such a way that the array is preserved and the row/column orders are preserved (or at least able to be kept track of and re-ordered at the end)?
library(foreach)
library(doParallel)
#> Loading required package: iterators
#> Loading required package: parallel
registerDoParallel(detectCores()-2)
The for loop doesn't need to be inside a function, but it'd be neat if it was.
d <- fakeDataList
n <- length(d)
This is really as far as I've been able to get with it:
out <- foreach(i=1:n, .combine = 'c') %:%
  foreach(j=(i+1):n, .combine = 'c') %dopar% {
    if (j <= n) {
      fakeInnerFxn(d[c(i, j)])
    }
  }
The answers are all here, but how do I get an array back?
out
#> [1] 2 3 4 3 4 5 4 5 6 4 5 6 5 6 7 6 7 8 7 8 9
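The flat vector holds 21 values rather than 18 because the `i = n` iteration also computes the pair (4, 4), which gives `7 8 9`. Since `foreach` returns results in iteration order by default (`.inorder = TRUE`), the vector can in principle be folded back into an array by replaying the same `(i, j)` sequence. The sketch below assumes that ordering and uses the printed vector directly, so it needs no parallel backend. The names `flat`, `ids`, `pairs` and `vals`, and the `low`/`mid`/`high` layer names, are illustrative additions.

```r
# The flat result printed above, and the ids in their original order
flat <- c(2, 3, 4, 3, 4, 5, 4, 5, 6, 4, 5, 6, 5, 6, 7, 6, 7, 8, 7, 8, 9)
ids <- c("one", "two", "three", "four")
n <- length(ids)

# One row per computed pair; columns are low, mid, high
vals <- matrix(flat, ncol = 3, byrow = TRUE)

# Replay the (i, j) pairs the nested loops visited, including the j <= n guard
pairs <- do.call(rbind, lapply(seq_len(n), function(i) {
  j <- (i + 1):n
  cbind(i = i, j = j[j <= n])
}))
stopifnot(nrow(pairs) == nrow(vals))

# Drop the (n, n) self-pair produced by the i = n iteration
keep <- pairs[, "i"] < pairs[, "j"]
pairs <- pairs[keep, , drop = FALSE]
vals <- vals[keep, , drop = FALSE]

# Fill the upper triangle of each layer
out <- array(0, dim = c(n, n, 3),
             dimnames = list(ids, ids, c("low", "mid", "high")))
for (k in seq_len(nrow(pairs))) {
  out[pairs[k, "i"], pairs[k, "j"], ] <- vals[k, ]
}
```

This relies on replaying the loop bounds exactly, which is fragile. Carrying the indices along with each result avoids that dependency.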
Created on 2021-06-22 by the reprex package (v1.0.0)
You can always return the indices with the results and reconstruct your array later.
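n <- length(d)
# Each task returns its (i, j) indices together with its three values.
# The outer loop stops at n-1 because the last element has no partner with j > i.
res <- foreach(i=1:(n-1), .combine = 'c') %:%
  foreach(j=(i+1):n) %dopar% {
    list(i, j, fakeInnerFxn(d[c(i, j)]))
  }
# Rebuild the array from the (i, j, values) triples, keeping the original names
out <- array(0, dim = c(n, n, 3))
rownames(out) <- names(d)
colnames(out) <- names(d)
for (res_k in res) out[res_k[[1]], res_k[[2]], ] <- res_k[[3]]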
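Since the question mentions that it would be neat to keep the loop inside a function, here is one way the approach above might be wrapped up. This is a sketch rather than a tuned implementation: `fakeOuterFxnPar`, its `inner` argument, the worker count of 2 and the `low`/`mid`/`high` layer names are all assumptions made for illustration.

```r
library(foreach)
library(doParallel)

# Inner function from the question, written compactly
fakeInnerFxn <- function(l) {
  x <- l[[1]]
  y <- l[[2]]
  c(x + y - 1, x + y, x + y + 1)
}

# Hypothetical wrapper: compute each pair in parallel, then rebuild the array
fakeOuterFxnPar <- function(d, inner = fakeInnerFxn) {
  n <- length(d)
  stopifnot(n >= 2)  # at least one pair is needed
  res <- foreach(i = seq_len(n - 1), .combine = "c") %:%
    foreach(j = (i + 1):n) %dopar% {
      # Carry the indices along so placement does not depend on result order
      list(i = i, j = j, value = inner(d[c(i, j)]))
    }
  out <- array(0, dim = c(n, n, 3),
               dimnames = list(names(d), names(d), c("low", "mid", "high")))
  for (r in res) out[r$i, r$j, ] <- r$value
  out
}

registerDoParallel(2)  # assumed worker count; adjust to the machine
result <- fakeOuterFxnPar(list(one = 1, two = 2, three = 3, four = 4))
stopImplicitCluster()
```

Passing the inner function as an argument keeps it a local variable of the wrapper, so it travels with the loop body instead of being looked up on the workers. Because every result carries its own `(i, j)`, the row and column order of the array follows `names(d)` regardless of the order in which tasks finish.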