Unrelated nested foreach with an outer %dopar% and an inner %do%


I am running tasks locally in parallel using %dopar% from the foreach package, with the doSNOW package creating the cluster (on a Windows machine at the moment). I have done this many times before and it works fine, until I place an unrelated foreach loop using %do% (i.e. non-parallel) inside it. Then R gives me this error (with traceback):

 Error in { : task 1 failed - "could not find function "%do%""
 3 stop(simpleError(msg, call = expr))
 2 e$fun(obj, substitute(ex), parent.frame(), e$data)
 1 foreach(rc = 1:5) %dopar% {
    aRandomCounter = -1
    if (1 > 0) {
        for (batchi in 1:20) { ...

Here is some code that replicates the problem on my machine:

require(foreach)
require(doSNOW)
cl<-makeCluster(5) 
registerDoSNOW(cl)
for(stepi in 1:10)  # normal outer for
{
  foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
  {
    aRandomCounter = -1
    if(1 > 0)
    {
      for(batchi in 1:20) 
      {
        anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do% 
        {
          return(runif(1)) # I know this is not efficient, it is a placeholder to reproduce the issue
        }
        aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
      } 
    }
    return(aRandomCounter)
  }
}
stopCluster(cl)

Replacing the inner foreach with a plain for loop or (l/s)apply works around the problem. But is there a way to make this work with the inner foreach, and why does the error occur in the first place?


Solution

  • Of course, I got it to work as soon as I posted (sorry; I will leave the question up in case someone else hits the same issue). It is a scoping issue: each worker is a separate R session, so packages attached in the master session are not attached there. I knew that any external package has to be loaded inside the %dopar% body, but I did not realize that this includes the foreach package itself. Here is the solution:

    require(foreach)
    require(doSNOW)
    cl<-makeCluster(5) 
    registerDoSNOW(cl)
    for(stepi in 1:10)  # normal outer for
    {
      foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
      {
        require(foreach) # the solution: attach foreach on the worker so %do% can be found
        aRandomCounter = -1
        if(1 > 0) 
        {
          for(batchi in 1:20) 
          {
            anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do% 
            {
              return(runif(1))
            }
            aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
          } 
        }
        return(aRandomCounter)
      }
    }
    stopCluster(cl)
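
As an alternative to calling require() inside the loop body, foreach() accepts a .packages argument naming packages to load on each worker before the body runs. The following is a minimal sketch, assuming a two-worker socket cluster (the worker count, the names attached_before, draws and result, and the shortened loop are illustrative choices, not part of the original code). It also checks whether foreach is attached on the workers beforehand, which on a fresh cluster is expected to be FALSE.

```r
library(foreach)
library(doSNOW)  # also attaches snow, which provides makeCluster() and clusterEvalQ()

cl <- makeCluster(2)  # assumption: two workers are enough for a demonstration
registerDoSNOW(cl)

# Ask each worker whether foreach is on its search path. On freshly started
# workers this is expected to be FALSE, which is why %do% was not found.
attached_before <- unlist(clusterEvalQ(cl, "package:foreach" %in% search()))

# .packages names the packages to load on each worker before the body is
# evaluated, so the body itself needs no require() call.
result <- foreach(rc = 1:5, .combine = c, .packages = "foreach") %dopar% {
  draws <- foreach(qrc = 1:100, .combine = c) %do% runif(1)
  sum(draws > 0.5)  # count of draws above 0.5 for this task
}

stopCluster(cl)
```

Either approach attaches foreach on the worker side; .packages keeps the dependency visible in the foreach() call rather than inside the body.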