Search code examples
rrcpparmadillorcpparmadillo

Template classes arma::Col in Rcpp Armadilllo


Dear fellow programmers,

in order to write a bootstrapped regression version for a seminar at university I tried to implement my R-version into Rcpp (this is for comparison of R to C++/Rcpp). However, I'm stuck with the error messages I receive, since I don't really understand those (the struggle of being new to C++ and especially Armadillo).

Here's the code I'm using. The first function is one I copied from the internet to get proper row-subsets of my matrix (in order to implement a proper non-parametric bootstrap):

    #include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]

arma::mat testrows(arma::mat x, arma::Col idx) {
  arma::mat xsub;
  xsub = x.rows(idx-1);
  return xsub;
}

Furthermore, here's the code I wrote for my actual bootstrap version:

   #include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat betaBoot(arma::colvec& y, const arma::mat& X, const int nboot) {
    int n = X.n_rows, k = X.n_cols;
    arma::mat betaHat(k,nboot);
    for(int i = 0; i < nboot; i++){
        Rcpp::NumericVector colId = Rcpp::runif(n, 0, (n-1));
        arma::mat X_boot = testrows(X, colId);
        arma::colvec y_boot = y(colId);
        betaHat.col(i) = arma::solve(X_boot, y_boot);
    }
    return betaHat;
}

betaHat is a matrix to contain the bootstrapped coefficient-vectors (in each column), the matrix should have dimension k x nboot. X_boot (within the loop) should be the bootstrapped data and y_boot the corresponding dependent observations. colId should be a random index for the bootstrapping procedure. Finally, betaHat should be returned.

attached you find a picture of the errors I receive when using sourceCpp.

Error Messages

Maybe it's something simple I just can't see or maybe the lack of experience, however learning this stuff would be great. So if you could help me with that, that would be great. Thank you in advance Erin

Edit: The way my bootstrapped regression function now looks (and works):

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat betaBoot(arma::colvec& y, const arma::mat& X, const int nboot) {
    int n = X.n_rows, k = X.n_cols;
    arma::mat betaHat(k,nboot);
    for(int i = 0; i < nboot; i++){
        arma::uvec colId = arma::randi<arma::uvec>(n, arma::distr_param(0,n-1));
      
        arma::mat X_boot = X.rows(colId);
        arma::colvec y_boot = y(colId);
        betaHat.col(i) = arma::solve(X_boot, y_boot);
    }
    return betaHat;
}

Thanks to all the helpful comments I received.


Solution

  • From comments; the function testrows takes an arma::Col for the indices but an Rcpp::NumericVector is passed in betaBoot, hence error. You are also subsetting with a real (random uniform) rather than an integer index. Rcpp and RcppArmadillo provide sample functions which can be used here. (Also from the armadillo help pages should be uvec).

    You can produce the indices to select rows using Rcpp sugar functions:

    Rcpp::IntegerVector x = Rcpp::seq(0, n-1);
    arma::uvec idx = Rcpp::as<arma::uvec>(Rcpp::sample(x, n, true)) ;
    

    Or you can do it directly with arma::randi:

    arma::uvec idx = arma::randi<arma::uvec>(n,  arma::distr_param(0,n-1));
    

    Some code with the second method, omitting testrows as "I found that I do not need my (stolen) helper function testrows":

    #include <RcppArmadillo.h>
    
    // [[Rcpp::depends(RcppArmadillo)]]
            
    // [[Rcpp::export]]
    arma::mat betaBoot( arma::mat X, arma::vec y, int nboot){
      
      int n = X.n_rows,  k = X.n_cols;
      arma::mat betaHat(k, nboot);
      
      for(int i=0; i < nboot; i++){
        arma::uvec idx = arma::randi<arma::uvec>(n,  arma::distr_param(0,n-1));
        arma::mat X_boot = X.rows(idx);
        arma::vec y_boot = y.elem(idx);
        betaHat.col(i) = arma::solve(X_boot, y_boot);
      }  
      return betaHat;
    }
    
    /***R
    set.seed(1)
    n = 25
    nc = 2
    x = cbind(1, matrix(rnorm(n*nc), nc=nc))
    y = rnorm(n)
    
    sim = betaBoot(x, y, 2000)
    */