Tags: r, foreach, parallel-processing, doparallel, dompi

R package containing foreach works with doParallel but not doMPI: object not found


I'm trying to write an R package that contains several nested functions, called within a foreach statement with a doMPI backend. It throws an "object 'XXX' not found" error. The strange thing is that this error does not occur if I use doParallel as the backend. This is a toy example of the problem, but I need a working solution with doMPI for much bigger problems.

This is the code that has been built into the R package using RStudio, roxygen, devtools, etc.

#' Test function level 1
#' @param var11 first variable for function 1
#' @param var12 second variable for function 1
#' @param var13 third variable for function 1
#' @export fun1

fun1 <- function (fun2.params, fun3.params, var11, var12, var13, ...) {

    results <- data.frame (foreach::`%dopar%`(
               foreach::`%:%`(foreach::foreach(j = 1:var11, .combine = cbind),
               foreach::foreach (i = 1:var12, .combine=rbind)),
               {
                   out3 <- replicate(var13,
                                     do.call(fun2,
                                             c(list(fun3.params=fun3.params),
                                               fun2.params)))
                   output2 <- data.frame(mean(out3))
               }
        )
    )
    ## save outputs for subsequent analyses if required
    saveRDS(results, file = paste("./outputs/", var13, "_", var12, "_", var11, "_",
                                  format(Sys.time(), "%d_%m_%Y"), ".rds", sep = ""))
}

#' Test function level 2
#' @param var21 first variable for function 2
#' @param var22 second variable for function 2
#' @export fun2

fun2 <- function (fun3.params, var21, var22, ...) {
    out2 <- `if` (rpois(1, var21) > 0, var22 * do.call(fun3, fun3.params), 0)
}

#' Test function level 3
#' @param var31 first variable for function 3
#' @param var32 second variable for function 3
#' @param var33 third variable for function 3
#' @export fun3

fun3 <- function (var31, var32, var33, ...) {
    out3 <- var31 * rnorm(1, mean=var32, sd= var33)
}

I then load the library and call the top-level function from an .R file, using Emacs ESS (or the RStudio editor), with these commands:

library(toymod)
library(doParallel)
cl <- makeCluster(10)
registerDoParallel(cl)

fun1.params <- list(var11=10, var12=150, var13=365)
fun2.params <- list(var21=0.05,var22=9.876)
fun3.params <- list(var31=1.396,var32=14.387,var33=3.219)

do.call(fun1, c(list(fun2.params = fun2.params,
                     fun3.params = fun3.params),
                fun1.params))

When I run it using doParallel as the parallel backend, it works fine. However, when I run it using doMPI, I get the following error:

Error in { : task 12 failed - "object 'fun2' not found"

This is running on Ubuntu 16.04 Linux, using R 3.4.1, doMPI 0.2.2, and doParallel. I've put the whole package on GitHub at https://github.com/jamaas/toymod.git

Could someone tell me if I need to change the code for doMPI? It seems to be related to producing the R package.


Solution

  • I believe the problem is that you need to use the foreach option .packages='toymod'. The body of the foreach loop isn't actually part of the 'toymod' package, so the workers need to load 'toymod' just as they would to access functions from any other R package.

    I don't know why this isn't necessary when using doParallel. I guess doParallel must automatically load the package that the foreach loop is in. I'll look into this some more, and perhaps modify doMPI to do the same.
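
As a minimal sketch of that change, assuming the package is named 'toymod' as in the question, fun1 could pass .packages to foreach as shown here. I have set the option on both nested foreach calls; as far as I know setting it on one is enough, but repeating it should be harmless. The saveRDS step from the original is left out to keep the sketch short, so this version simply returns the results.

```r
# Sketch only: fun1 from the question, with .packages added so that each
# worker loads 'toymod' before evaluating the loop body.
# "toymod" is the package name from the question; change it to match
# whatever package actually defines fun2 and fun3.
fun1 <- function(fun2.params, fun3.params, var11, var12, var13, ...) {
    results <- data.frame(foreach::`%dopar%`(
        foreach::`%:%`(
            foreach::foreach(j = 1:var11, .combine = cbind,
                             .packages = "toymod"),
            foreach::foreach(i = 1:var12, .combine = rbind,
                             .packages = "toymod")),
        {
            # fun2 (and fun3, which it calls) are looked up on the worker,
            # so the package that defines them must be loaded there.
            out3 <- replicate(var13,
                              do.call(fun2,
                                      c(list(fun3.params = fun3.params),
                                        fun2.params)))
            data.frame(mean(out3))
        }))
    results
}
```

The calling script should not need to change, apart from registering doMPI rather than doParallel as the backend.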