Rcpp random shuffle: reproducing results across operating systems


I am writing an R package that uses the Rcpp package. Until now, I have been developing and testing on macOS Catalina with R version 4.0.0. Using set.seed from R, I get reproducible results on my Mac. However, when testing on a Windows 10 machine (also with R 4.0.0), I get a different result with the same seed.

In the package, I call std::random_shuffle to generate a permutation. I am using the code from the Rcpp website to do this. In C++, it is:

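#include <Rcpp.h>
#include <algorithm>  // std::random_shuffle (deprecated in C++14, removed in C++17)
#include <cmath>      // std::floor

// Draw an integer index in [0, n) from R's RNG, so that set.seed() applies
inline int randWrapper(const int n) { return static_cast<int>(std::floor(unif_rand() * n)); }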

// [[Rcpp::export]]
Rcpp::NumericVector randomShuffle(Rcpp::NumericVector a) {

    // clone a into b to leave a alone
    Rcpp::NumericVector b = Rcpp::clone(a);

    std::random_shuffle(b.begin(), b.end(), randWrapper);

    return b;
}

Then it can be sourced from R and called as

a <- 1:8
set.seed(42)
randomShuffle(a)

Running this code I get 8 1 4 2 7 5 3 6 on my Mac and 1 4 3 7 5 8 6 2 on Windows.

At first I thought this might be due to differences in the way unif_rand() is implemented on the two operating systems, but when I tested

#include<Rcpp.h>

// [[Rcpp::export]]
Rcpp::NumericVector rint() {
    Rcpp::NumericVector a(1);
    a[0] = unif_rand();
    return a;
}

I get the same result of 0.914806 on both machines. The same thing happens if I replace unif_rand() with R::runif(0,1). I believe that the results for the Mac are the same for both g++ and clang++.

Any advice or insight on how to fix this (if possible) would be appreciated!


Solution

  • I think the confusion comes from the fact that std::random_shuffle() is a standard C++ library function, not an R or Rcpp one. When you pass randWrapper, the random draws do come from R's RNG, but how those draws are turned into swaps is left to each standard library implementation. So the different results on macOS and Windows are, I am guessing, due to the different C++ libraries (presumably libc++ on macOS and libstdc++ on Windows).

    This is consistent with your check of the runif() and unif_rand() values: the random stream is the same on both machines, and only the way std::random_shuffle() consumes it differs.

    So as @BenBolker hinted, for reproducible results across OSs you may want to stick to R (and Rcpp) functions.
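
One way to follow that advice is to write the shuffle loop yourself, so that the order in which draws are consumed is fixed by your own code rather than by the standard library. The sketch below is plain C++ with no Rcpp dependency so it can be compiled anywhere. ReplayUniform, backwardShuffle and forwardShuffle are hypothetical names introduced here for illustration, and ReplayUniform is only a stand-in for unif_rand() that replays a fixed list of draws. The two loops are both valid Fisher-Yates variants; they are not claimed to be the exact code of any particular standard library, but they show how the same draws can produce different permutations.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for R's unif_rand(): replays a fixed list of
// values in [0, 1), cycling when the list is exhausted.
struct ReplayUniform {
    std::vector<double> draws;
    std::size_t next;
    explicit ReplayUniform(std::vector<double> d)
        : draws(std::move(d)), next(0) {}
    double operator()() {
        assert(!draws.empty());
        const double u = draws[next];
        next = (next + 1) % draws.size();
        return u;
    }
};

// Backward Fisher-Yates: for i = n down to 2, swap element i-1 with a
// uniformly chosen element j in [0, i-1].
template <typename Gen>
std::vector<double> backwardShuffle(std::vector<double> v, Gen& unif) {
    for (std::size_t i = v.size(); i > 1; --i) {
        const std::size_t j = static_cast<std::size_t>(unif() * i);
        assert(j < i);  // holds as long as unif() returns a value in [0, 1)
        std::swap(v[i - 1], v[j]);
    }
    return v;
}

// Forward Fisher-Yates: for i = 0 .. n-2, swap element i with a
// uniformly chosen element j in [i, n-1].
template <typename Gen>
std::vector<double> forwardShuffle(std::vector<double> v, Gen& unif) {
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        const std::size_t j = i + static_cast<std::size_t>(unif() * (n - i));
        assert(j < n);
        std::swap(v[i], v[j]);
    }
    return v;
}
```

With the input 1 2 3 4 5 and the draws 0.9, 0.1, 0.5, 0.3, backwardShuffle gives 3 4 2 1 5 while forwardShuffle gives 5 2 4 3 1. In an Rcpp package, the generator argument could be a small callable that returns unif_rand(), for example a lambda stored in a local variable, so that set.seed() controls the result and the same loop runs on every platform.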