What's the difference between using the "doParallel" package with type = MPI and using doMPI directly?
library(foreach)
library(doParallel)
library(Rmpi)  # provides mpi.universe.size(); type='MPI' clusters also require the snow package
cl <- makeCluster(mpi.universe.size(), type='MPI')
registerDoParallel(cl)
system.time(foreach(i = 1:3) %dopar% {Sys.sleep(i); i})
VS
library(doMPI)
cl <- startMPIcluster(count=2)
registerDoMPI(cl)
system.time(foreach(i = 1:3) %dopar% {Sys.sleep(i); i})
The "doParallel" package acts as a wrapper around the "clusterApplyLB" function, which, for an MPI cluster, is implemented by calling functions from the "Rmpi" package.
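To make the wrapper relationship concrete, here is a minimal sketch of calling "clusterApplyLB" directly. It assumes a local socket (PSOCK) cluster with two workers in place of the MPI cluster, purely so that it runs without an MPI installation, and it shortens the sleep times; neither choice comes from the question.

```r
library(parallel)

# Assumption: a local PSOCK cluster stands in for the MPI cluster
cl <- makeCluster(2)

# Load-balanced apply: each idle worker is handed the next element of 1:3.
# Results come back as a list in the order of the inputs.
res <- clusterApplyLB(cl, 1:3, function(i) { Sys.sleep(i / 10); i })

stopCluster(cl)
print(unlist(res))
```

With doParallel registered, a foreach loop over the same inputs is dispatched through this kind of call, so it inherits the behavior and the limits of "clusterApplyLB".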
The "doMPI" package uses "Rmpi" functions directly and includes some features that aren't available in "clusterApplyLB":
supports fetching inputs and combining outputs on-the-fly to efficiently handle a large number of loop iterations;
supports MPI broadcast to initialize workers;
allows workers to be started either by mpirun or by the MPI spawn function.
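As a rough illustration of the mpirun option, the sketch below omits the count argument so that doMPI uses the processes launched by mpirun rather than spawning its own workers. The script name and the process count in the launch command are hypothetical, and the exact mpirun flags depend on the MPI installation.

```r
# Launch with something like (hypothetical file name):
#   mpirun -n 3 R --slave -f script.R
library(doMPI)

# No count argument: the workers are the processes started by mpirun
cl <- startMPIcluster()
registerDoMPI(cl)

r <- foreach(i = 1:3) %dopar% { Sys.sleep(i); i }
print(unlist(r))

# Shut down the workers, then finalize MPI and exit R
closeCluster(cl)
mpi.quit()
```

When the script is instead run from an interactive R session, startMPIcluster(count=2) as in the question uses MPI spawn to start the workers.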