Tags: r, parallel-processing, azure-machine-learning-service, microsoft-r

Parallel *apply in Azure Machine Learning Studio


I have just started to get myself acquainted with parallelism in R.

As I am planning to use Microsoft Azure Machine Learning Studio for my project, I started investigating what Microsoft R Open offers for parallelism, and I found this article, which says that parallelism is handled under the hood, leveraging all available cores without any changes to the R code. The article also shows some performance benchmarks, but most of them demonstrate the benefit for mathematical operations.

This was good so far. In addition, I would like to know whether it also parallelizes the *apply functions under the hood. I also found these 2 articles that describe how to parallelize *apply functions in general:

  1. Quick guide to parallel R with snow: describes facilitating parallelism using the snow package, the par*apply function family, and clusterExport.
  2. A gentle introduction to parallel computing in R: uses the parallel package, the par*apply function family, and binding values to the environment (see the sketch after this list).
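
For reference, here is a minimal sketch of the explicit approach those two articles describe, using the parallel package (which bundles much of snow's API); the function and variable names are made up for illustration:

    library(parallel)

    # A toy function that refers to a global value, plus some inputs
    scale_factor <- 10
    slow_scale <- function(x) {
      Sys.sleep(0.01)        # simulate a slow computation
      x * scale_factor       # refers to a global, so workers need it too
    }
    inputs <- 1:100

    # Start one worker process per detected core
    cl <- makeCluster(detectCores())

    # Export the global object the workers need into their environments
    clusterExport(cl, "scale_factor")

    # Parallel equivalent of lapply()
    results <- parLapply(cl, inputs, slow_scale)

    # Always shut the workers down when finished
    stopCluster(cl)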

So my question is: when I use *apply functions in Microsoft Azure Machine Learning Studio, will they be parallelized under the hood by default, or do I need to make use of packages like parallel, snow, etc.?


Solution

  • Personally, I think we could have marketed MRO a bit differently, without making such a big deal about parallelism/multithreading. Ah well.

    R comes with an Rblas.dll/.so which implements the routines used for linear algebra computations. These routines are used in various places, but one common use case is for fitting regression models. With MRO, we replace the standard Rblas with one that uses the Intel Math Kernel Library. When you call a function like lm or glm, MRO will use multiple threads and optimized CPU instructions to fit the model, which can get you dramatic speedups over the standard implementation.
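
    As a rough illustration of the kind of workload the MKL accelerates, here is a sketch you could run on plain CRAN R and on MRO to compare timings (the matrix size is arbitrary, and results will vary by machine):

        # Dense matrix multiplication and crossproducts go through the
        # BLAS routines that MRO replaces with the Intel MKL.
        set.seed(1)
        n <- 2000
        m <- matrix(rnorm(n * n), nrow = n)

        system.time(m %*% m)        # multithreaded under MRO's MKL
        system.time(crossprod(m))   # likewise

        # Model fitting also relies on these linear-algebra routines
        df <- data.frame(y = rnorm(n), x = m[, 1:50])
        system.time(lm(y ~ ., data = df))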

    MRO isn't the only way you can get this sort of speedup; you can also compile/download other BLAS implementations that are similarly optimized. We just make it an easy one-step download.

    Note that the MKL only affects code that involves linear algebra. It isn't a general-purpose speedup tool; any R code that doesn't do matrix computations won't see a performance improvement. In particular, it won't speed up any code that involves explicit parallelism, such as code using the parallel package, SNOW, or other cluster computing tools.

    On the other hand, it won't degrade them either. You can still use packages like parallel, SNOW, etc. to create compute clusters and distribute your code across multiple processes. MRO works just like regular CRAN R in this respect. (One thing you might want to do, though, if you're creating a cluster of nodes on the same machine, is reduce the number of MKL threads. Otherwise you risk contention between the nodes for CPU cores, which will degrade performance.)
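
    For example, here is one way to dial the MKL back on each worker before distributing work; this sketch assumes MRO's bundled RevoUtilsMath package is available on the workers (if it isn't, setting the MKL_NUM_THREADS environment variable before creating the cluster is another option):

        library(parallel)

        cl <- makeCluster(4)   # e.g. 4 local worker processes

        # Limit each worker to a single MKL thread so the workers
        # don't compete with each other for CPU cores.
        # (Assumes RevoUtilsMath, shipped with MRO, is installed.)
        clusterEvalQ(cl, {
          if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
            RevoUtilsMath::setMKLthreads(1)
          }
        })

        # ... distribute work with parLapply()/clusterApply() as usual ...

        stopCluster(cl)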

    Disclosure: I work for Microsoft.