Search code examples
rrcppsnow

Using Rcpp within parallel code via snow to make a cluster


I've written a function in Rcpp and compiled it with inline. Now, I want to run it in parallel on different cores, but I'm getting a strange error. Here's a minimal example, where the function funCPP1 can be compiled and runs well by itself, but cannot be called by snow's clusterCall function. The function runs well as a single process, but gives the following error when ran in parallel:

Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  2 nodes produced errors; first error: NULL value passed as symbol address

And here is some code:

## Load and compile
library(inline)
library(Rcpp)
library(snow)
src1 <- '
     Rcpp::NumericMatrix xbem(xbe);
     int nrows = xbem.nrow();
     Rcpp::NumericVector gv(g);
     for (int i = 1; i < nrows; i++) {
      xbem(i,_) = xbem(i-1,_) * gv[0] + xbem(i,_);
     }
     return xbem;
'
funCPP1 <- cxxfunction(signature(xbe = "numeric", g="numeric"),body = src1, plugin="Rcpp")

## Single process
A <- matrix(rnorm(400), 20,20)
funCPP1(A, 0.5)

## Parallel
cl <- makeCluster(2, type = "SOCK") 
clusterExport(cl, 'funCPP1') 
clusterCall(cl, funCPP1, A, 0.5)

Solution

  • Think it through -- what does inline do? It creates a C/C++ function for you, then compiles and links it into a dynamically-loadable shared library. Where does that one sit? In R's temp directory.

    So you tried the right thing by shipping the R frontend calling that shared library to the other process (which has another temp directory !!), but that does not get the dll / so file there.

    Hence the advice is to create a local package, install it and have both snow processes load and call it.

    (And as always: better quality answers may be had on the rcpp-devel list which is read by more Rcpp constributors than SO is.)