
Exporting RcppParallel::RVector vs std::vector


Consider the following serial example function:

// [[Rcpp::plugins(cpp20)]]
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector example_fun(int n) {
  Rcpp::NumericVector result (n);
  for(int i = 0; i < n; ++i) {
    result[i] = something();
  }
  return result;
}

Parallelizing this loop with OpenMP requires the use of e.g. std::vector or RcppParallel::RVector, as Rcpp vectors are not thread-safe. The corresponding parallel std::vector version is

// [[Rcpp::plugins(cpp20)]]
// [[Rcpp::plugins(openmp)]]
#include <Rcpp.h>
#include <omp.h>
#include <vector>

// [[Rcpp::export]]
std::vector<double> example_fun(int n) {
  std::vector<double> result (n);
  #pragma omp parallel for
  for(int i = 0; i < n; ++i) {
    result[i] = something();
  }
  return result;
}

and similarly with RcppParallel::RVector<double> example_fun(int n) for RcppParallel::RVector.

If I understand correctly, exporting an Rcpp::NumericVector makes the data available to R without copying it, as it is essentially R's native data type. What I would like to know is how exporting a std::vector or an RcppParallel::RVector works internally. Is the vector copied? Is it moved? Does it require type conversions? And, importantly, is one of the two options clearly more efficient than the other?

As a quick additional question, I would also like to know whether the Rcpp thread-safety issue also applies to vectorized SIMD loops: #pragma omp simd or #pragma omp parallel for simd?

Thanks.


Solution

  • You may be complicating things for yourself here. It helps to step back, and you can also inspect what R does when you experiment; the memory profiling options, for example, are good for this!

    In a nutshell, R uses SEXP types, and these include 'native' integer and double vectors, which you can create from R with integer(3) and double(4) to allocate a three-element and a four-element vector, respectively.

    Now, using Rcpp::IntegerVector and Rcpp::NumericVector does exactly the same thing, down to the bit. It uses R's own allocator, and the resulting object is indistinguishable to R from one created in R. (It is of course the same via the C API of R itself, which is what Rcpp accesses.)

    On the other hand, C++ STL objects like std::vector, or contributed types like RVector, use memory allocated elsewhere (which helps, e.g., with RcppParallel, as you note, and keeps it thread-safe), so each of these must be copied into an R data structure. And that really is the gist of it: data that "already is like R's own" needs no copy. Everything else needs a copy.

    (As a total aside, there is no C++20 anywhere in your example. Not even C++11. What you wrote would have built ten+ years ago, when C++98 (!!) was still the default. Life is much better now under current R and current compilers, so I hardly ever set standards plugins: C++14 or C++17 is already the default if you are on a current-enough system.)
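
The copy versus no-copy distinction described above can be sketched in plain C++, without Rcpp, so that it compiles on its own. This is only an illustration under stated assumptions: something(int) is a hypothetical stand-in for the question's something(), and dest stands in for a buffer that R already owns. In actual Rcpp code one would presumably allocate Rcpp::NumericVector out(n) on the main thread and take out.begin() as the pointer before entering the parallel region. fill_via_std_vector mirrors what happens when a std::vector<double> is returned and converted to an R vector (one extra O(n) copy), while fill_in_place writes straight into the destination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's something(); any pure function of i works.
inline double something(int i) { return 0.5 * i; }

// Pattern A: compute into a std::vector, then copy into the destination buffer.
// The final std::copy plays the role of the conversion to an R vector.
void fill_via_std_vector(double* dest, int n) {
  assert(n >= 0 && (dest != nullptr || n == 0));
  std::vector<double> tmp(static_cast<std::size_t>(n));
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    tmp[i] = something(i);  // each thread writes a distinct element
  }
  std::copy(tmp.begin(), tmp.end(), dest);  // the extra O(n) copy
}

// Pattern B: write directly into memory the caller already owns
// (assumed here to stand in for an R-allocated vector). No copy afterwards.
void fill_in_place(double* dest, int n) {
  assert(n >= 0 && (dest != nullptr || n == 0));
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    dest[i] = something(i);  // plain pointer write, no R API call in the loop
  }
}
```

Both functions leave identical contents in dest; the only difference is the temporary buffer and the copy in the first one. If the file is compiled without OpenMP support, the pragmas are ignored and the loops simply run serially.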