Tags: r, function, parallel-processing, lapply

parLapply - How to solve error "Could not find function "bindToEnv""?


I want to use parLapply, and I am setting up my code as described here: http://www.win-vector.com/blog/2016/01/parallel-computing-in-r/

This worked well the last few times. With my current parLapply call, however, I get this error:

Error in checkForRemoteErrors(val) : 3 nodes produced errors; first error: could not find function "bindToEnv"

Here is a short example:

#' Copy arguments into env and re-bind any function's lexical scope to bindTargetEnv .
#' 
#' See http://winvector.github.io/Parallel/PExample.html for example use.
#' 
#' 
#' Used to send data along with a function in situations such as parallel execution 
#' (when the global environment would not be available).  Typically called within 
#' a function that constructs the worker function to pass to the parallel processes
#' (so we have a nice lexical closure to work with).
#' 
#' @param bindTargetEnv environment to bind to
#' @param objNames additional names to lookup in parent environment and bind
#' @param doNotRebind names of functions to NOT rebind the lexical environments of
bindToEnv <- function(bindTargetEnv=parent.frame(),objNames,doNotRebind=c()) {
  # Bind the values into environment
  # and switch any functions to this environment!
  for(var in objNames) {
    val <- get(var,envir=parent.frame())
    if(is.function(val) && (!(var %in% doNotRebind))) {
      # replace function's lexical environment with our target (DANGEROUS)
      environment(val) <- bindTargetEnv
    }
    # assign object to target environment, only after any possible alteration
    assign(var,val,envir=bindTargetEnv)
  }
}

ccc <- 1

# Parallel
cl <- parallel::makeCluster(getOption("cl.cores", 3))
junk <- parallel::clusterEvalQ(cl, c(library(data.table)))

f <- function(x) {
  bindToEnv(objNames = 'ccc')

  return(x+x)  
}

b <- do.call(rbind, parallel::parLapply(cl, 1:10,  f))

If I don't add bindToEnv everything works fine. What am I doing wrong? Thanks!


Solution

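  • Each worker started by makeCluster() is a fresh R session, so functions and objects defined in your main session (here bindToEnv and ccc) do not exist there. parLapply() does send f to the workers, but f's enclosing environment is the global environment, so the names it uses are looked up in each worker's own global environment and are not found. Call clusterExport() after creating the cluster and before calling parLapply() to copy them over: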

    library(parallel)
    cl <- makeCluster(getOption("cl.cores", 3))
    clusterEvalQ(cl, c(library(data.table)))
    clusterExport(cl, c("bindToEnv", "ccc"), 
                  envir=environment())
    f <- function(x) {
      bindToEnv(objNames='ccc')
      return(x+x)  
    }
    
    b <- do.call(rbind, parallel::parLapply(cl, 1:10,  f))
    b
    #       [,1]
    #  [1,]    2
    #  [2,]    4
    #  [3,]    6
    #  [4,]    8
    #  [5,]   10
    #  [6,]   12
    #  [7,]   14
    #  [8,]   16
    #  [9,]   18
    # [10,]   20
    
    stopCluster(cl)
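
As an alternative to exporting, bindToEnv can be called on the master inside a function that builds the worker, which is how its own documentation above says it is typically used. The following is only a sketch under that assumption: mkWorker is a hypothetical name, the cluster size of 2 is arbitrary, and the worker adds ccc to the result just to show that the value travels with the closure.

```r
library(parallel)

# bindToEnv as defined in the question (comments trimmed)
bindToEnv <- function(bindTargetEnv = parent.frame(), objNames, doNotRebind = c()) {
  for (var in objNames) {
    val <- get(var, envir = parent.frame())
    if (is.function(val) && !(var %in% doNotRebind)) {
      environment(val) <- bindTargetEnv
    }
    assign(var, val, envir = bindTargetEnv)
  }
}

ccc <- 1

# Hypothetical factory. It runs in the main session, so bindToEnv is
# found there and never has to exist on the workers.
mkWorker <- function() {
  bindToEnv(objNames = "ccc")  # copy ccc into this function's frame
  function(x) {
    x + x + ccc                # ccc is resolved from the captured frame
  }
}

cl <- makeCluster(2)
res <- parLapply(cl, 1:3, mkWorker())  # the closure is serialized with its frame
stopCluster(cl)

unlist(res)
# [1] 3 5 7
```

The worker function itself never calls bindToEnv, so nothing needs to be passed to clusterExport() in this variant. Objects that are large still get copied to every worker along with the closure.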