Dear fellow programmers,
in order to write a bootstrapped regression version for a seminar at university I tried to implement my R-version into Rcpp (this is for comparison of R to C++/Rcpp). However, I'm stuck with the error messages I receive, since I don't really understand those (the struggle of being new to C++ and especially Armadillo).
Here's the code I'm using. The first function is one I copied from the internet to get proper row-subsets of my matrix (in order to implement a proper non-parametric bootstrap):
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat testrows(arma::mat x, arma::Col idx) {
arma::mat xsub;
xsub = x.rows(idx-1);
return xsub;
}
Furthermore, here's the code I wrote for my actual bootstrap version:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat betaBoot(arma::colvec& y, const arma::mat& X, const int nboot) {
int n = X.n_rows, k = X.n_cols;
arma::mat betaHat(k,nboot);
for(int i = 0; i < nboot; i++){
Rcpp::NumericVector colId = Rcpp::runif(n, 0, (n-1));
arma::mat X_boot = testrows(X, colId);
arma::colvec y_boot = y(colId);
betaHat.col(i) = arma::solve(X_boot, y_boot);
}
return betaHat;
}
betaHat
is a matrix to contain the bootstrapped coefficient-vectors (in each column), the matrix should have dimension k x nboot. X_boot
(within the loop) should be the bootstrapped data and y_boot
the corresponding dependent observations. colId
should be a random index for the bootstrapping procedure. Finally, betaHat
should be returned.
attached you find a picture of the errors I receive when using sourceCpp.
Maybe it's something simple I just can't see or maybe the lack of experience, however learning this stuff would be great. So if you could help me with that, that would be great. Thank you in advance Erin
Edit: The way my bootstrapped regression function now looks (and works):
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat betaBoot(arma::colvec& y, const arma::mat& X, const int nboot) {
int n = X.n_rows, k = X.n_cols;
arma::mat betaHat(k,nboot);
for(int i = 0; i < nboot; i++){
arma::uvec colId = arma::randi<arma::uvec>(n, arma::distr_param(0,n-1));
arma::mat X_boot = X.rows(colId);
arma::colvec y_boot = y(colId);
betaHat.col(i) = arma::solve(X_boot, y_boot);
}
return betaHat;
}
Thanks to all the helpful comments I received.
From comments; the function testrows
takes an arma::Col
for the indices but an Rcpp::NumericVector
is passed in betaBoot
, hence error. You are also subsetting with a real (random uniform) rather than an integer index. Rcpp
and RcppArmadillo
provide sample
functions which can be used here. (Also from the armadillo help pages should be uvec).
You can produce the indices to select rows using Rcpp
sugar functions:
Rcpp::IntegerVector x = Rcpp::seq(0, n-1);
arma::uvec idx = Rcpp::as<arma::uvec>(Rcpp::sample(x, n, true)) ;
Or you can do it directly with arma::randi:
arma::uvec idx = arma::randi<arma::uvec>(n, arma::distr_param(0,n-1));
Some code with the second method, omitting testrows
as "I found that I do not need my (stolen) helper
function testrows":
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat betaBoot( arma::mat X, arma::vec y, int nboot){
int n = X.n_rows, k = X.n_cols;
arma::mat betaHat(k, nboot);
for(int i=0; i < nboot; i++){
arma::uvec idx = arma::randi<arma::uvec>(n, arma::distr_param(0,n-1));
arma::mat X_boot = X.rows(idx);
arma::vec y_boot = y.elem(idx);
betaHat.col(i) = arma::solve(X_boot, y_boot);
}
return betaHat;
}
/***R
set.seed(1)
n = 25
nc = 2
x = cbind(1, matrix(rnorm(n*nc), nc=nc))
y = rnorm(n)
sim = betaBoot(x, y, 2000)
*/