Search code examples
c++rprecisionrcpparmadillo

Difference between R's sum() and Armadillo's accu()


There are small differences in the results of R's sum() function and RcppArmadillo's accu() function when given the same input. For example, the following code:

R:

vec <- runif(100, 0, 0.00001)
accu(vec)
sum(vec)

C++:

// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::export]]
double accu(arma::vec& obj)
{
    return arma::accu(obj);
}

Gives the results:

0.00047941851844312633 (C++)

0.00047941851844312628 (R)

According to http://keisan.casio.com/calculator the true answer is:

4.79418518443126270948E-4

These small differences add up in my algorithm and significantly affect the way it executes. Is there a way to more accurately sum up vectors in C++? Or at least to get the same results that R does without having to call R code?


Solution

  • What I have found:

    I successfully managed to write a function which is able to mimic R's sum function. It appears R uses a higher precision variable to store the results of each addition operation.

    What I wrote:

    // [[Rcpp::depends("RcppArmadillo")]]
    // [[Rcpp::export]]
    double accu2(arma::vec& obj)
    {
        long double result = 0;
        for (auto iter = obj.begin(); iter != obj.end(); ++iter)
        {
            result += *iter;
        }
        return result;
    }
    

    How it compares in speed:

    set.seed(123)
    vec <- runif(50000, 0, 0.000001)
    microbenchmark(
      sum(vec),
      accu(vec),
      accu2(vec)
    )
    
    
           expr    min     lq     mean  median      uq    max neval
       sum(vec) 72.155 72.351 72.61018 72.6755 72.7485 75.068   100
      accu(vec) 48.275 48.545 48.84046 48.7675 48.9975 52.128   100
     accu2(vec) 69.087 69.409 70.80095 69.6275 69.8275 182.955  100
    

    So, my c++ solution is still faster than R's sum, however it is significantly slower than armadillo's accu()