I am trying to compute the mean of a variable using a simple, recursive implementation:
m <- 0 # initialize mean
for (irep in 0:999) {
  # new data point
  new_data <- rnorm(1, 2, 1)
  # recursive formula for sample mean
  m <- (irep / (irep + 1)) * m + (1 / (irep + 1)) * new_data
}
Here, m will quickly converge to 2, the mean of the normal distribution from which the new data points are drawn. Implementing something similar in Rcpp:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
double rec_mean(int sample_size){
  double m = 0; // initialize mean
  for(int irep = 0; irep < sample_size; irep++){
    // new data
    double new_data = R::rnorm(2,1);
    // mean recursive update
    m = ((irep)/(irep+1)) * m + (1/(irep+1)) * new_data;
  }
  return m;
}
This code does not behave as expected: it returns the initial value (0) instead of a value close to 2. Can somebody point out the error in my translation from R to Rcpp?
In this line:
m = ((irep)/(irep+1)) * m + (1/(irep+1)) * new_data;
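You are dividing an int by another int twice. In C++, integer division yields another integer and discards the fractional part, so irep/(irep+1) is always 0 and 1/(irep+1) is 0 for every irep greater than 0. After the second iteration, m is therefore stuck at 0.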
To get what you want, force the divisions to be done in floating point:
m = ((irep)/(irep+1.0)) * m + (1.0/(irep+1.0)) * new_data;