
Running makeCluster within withTimeout does not work


The makeCluster function sometimes hangs in my code, and simply rerunning it fixes the issue.

To automate this, I'm trying to combine a while loop with the withTimeout function, so that makeCluster times out when it hangs and is then retried.

The problem is that when I create my cluster inside withTimeout, I can't use it afterwards for my parallel lapply.

    library(parallel)
    library(R.utils)
    library(pbapply)

    cl = NULL

    while (is.null(cl)) {
        cl = withTimeout({makeCluster(4, type = 'FORK')}, timeout = 3,
                         onTimeout = "silent", envir = environment())
    }

    pblapply(1:3, function(x){x+1}, cl = cl)

The error message I'm getting is:

Error in serialize(data, node$con, xdr = FALSE) : error writing to connection


Solution

  • When the envir argument of withTimeout() is not set, it falls back to its default, parent.frame(). For code run at top level, the cluster is then created in .GlobalEnv and everything works as expected.

    library(parallel)
    library(R.utils)
    library(pbapply)
    cl <- NULL
    while(is.null(cl)){
        cl <- withTimeout(makeCluster(4, type='FORK'), timeout=3, onTimeout="silent")
    }
    pblapply(1:3, function(x){x+1}, cl = cl)
      |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s  
    [[1]]
    [1] 2
    
    [[2]]
    [1] 3
    
    [[3]]
    [1] 4
    

    In the question you set envir = environment(), which resolves to the calling environment of the function. That environment is different from .GlobalEnv and seems unsuitable for creating clusters. For more information on environments, see also the related question "R eval(): changed behavior when argument 'envir' is explicitly set to default value".
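
The while loop above retries forever if makeCluster() never succeeds. The following is a minimal sketch of a bounded variant, not part of the original answer: the cap max_attempts and the counter attempt are hypothetical names introduced here, and parLapply() from the parallel package stands in for pblapply() so that the example needs only parallel and R.utils. Like the original, it assumes a Unix-alike system, since FORK clusters are not available on Windows.

```r
library(parallel)
library(R.utils)

max_attempts <- 5   # hypothetical cap on retries (not in the original)
attempt <- 0
cl <- NULL

# Retry until a cluster is returned or the cap is reached.
# envir is left at its default, as in the answer above.
while (is.null(cl) && attempt < max_attempts) {
    attempt <- attempt + 1
    cl <- withTimeout(makeCluster(4, type = "FORK"),
                      timeout = 3, onTimeout = "silent")
}

if (is.null(cl)) {
    stop("makeCluster() did not finish within the timeout after ",
         max_attempts, " attempts")
}

# Use the cluster, then release the worker processes.
res <- parLapply(cl, 1:3, function(x) x + 1)
stopCluster(cl)
```

With onTimeout = "silent", withTimeout() returns NULL when the time limit is hit, which is what the loop condition relies on. Calling stopCluster() at the end shuts the workers down once the cluster is no longer needed.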