I'm running on Linux and used mclapply
easily. I run into some errors with parlapply
, even after using clusterEvalQ
.
Before I go further to resolve the issue, is there any point, i.e. could there be a significant speed difference between the two or do people just use parLapply
when on Windows?
I've read about parLapplyLB
and can see the uses of this approach, but if I'm strictly looking at mclapply
and parlapply
does the FORK approach and PSOCK approach vary much in speed?
The nature of my function may determine the answer; it is using stri_extract
.
Some quick benchmarks suggest that mclapply
could be slightly faster, but this probably depends on the specific system and problem. The more balanced the jobs and the slower the actual tasks the less it should matter, which function you use.
library(parallel)
library(microbenchmark)
microbenchmark(
parLapply = {cl <- makeCluster(2)
parLapply(cl, rep(1:7, 3), function(x) {set.seed(1); rnorm(10^x)})
stopCluster(cl)},
mclapply = {mclapply(rep(1:7 , 3), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2)},
times = 10
)
#Unit: seconds
# expr min lq mean median uq max neval
#parLapply 1.85548 2.04397 3.332970 3.071284 4.323514 6.294364 10
#mclapply 1.62610 1.65288 2.217407 1.849594 2.243418 5.435189 10
microbenchmark(
parLapply = {cl <- makeCluster(2)
parLapply(cl, rep(6, 20), function(x) {set.seed(1); rnorm(10^x)})
stopCluster(cl)},
mclapply = {mclapply(rep(6, 20), function(x) {set.seed(1); rnorm(10^x)}, mc.cores = 2)},
times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval
#parLapply 1150.657 1188.9750 1705.1364 1242.739 2071.276 3785.516 10
# mclapply 820.692 932.2262 994.4404 1000.402 1079.930 1117.863 10
sessionInfo()
#R version 3.3.1 (2016-06-21)
#Platform: x86_64-pc-linux-gnu (64-bit)
#Running under: Ubuntu 14.04.5 LTS
#
#locale:
# [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
# [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] parallel stats graphics grDevices utils datasets methods base
#
#other attached packages:
#[1] microbenchmark_1.4-2.1 doParallel_1.0.10 iterators_1.0.8 foreach_1.4.3
#
#loaded via a namespace (and not attached):
# [1] colorspace_1.2-6 scales_0.4.0 plyr_1.8.4 tools_3.3.1 gtable_0.2.0 Rcpp_0.12.4
# [7] ggplot2_2.1.0 codetools_0.2-14 grid_3.3.1 munsell_0.4.3