
Warning when downcasting in Rcpp?


I have an Rcpp function, toInt, that should take an IntegerVector as input. I want to use it on integer vectors, but also on double vectors whose values are all whole numbers (e.g. 1:4 is of type integer, but 1:4 + 1 is of type double).

However, when it is called on values with a fractional part (e.g. 1.5), I would like it to raise a warning or an error instead of silently truncating the values to integers.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector toInt(RObject x) {
  return as<IntegerVector>(x);
}


> toInt(c(1.5, 2.4))  # I would like a warning
[1] 1 2

> toInt(1:2 + 1)      # No need of warning
[1] 2 3

Solution

  • The first solution I thought of:

    // [[Rcpp::export]]
    IntegerVector toInt2(const NumericVector& x) {
      for (int i = 0; i < x.size(); i++) {
        if (x[i] != (int)x[i]) {
          warning("Uh-oh");
          break;
        }
      }
      return as<IntegerVector>(x);
    }
    

    However, I wondered whether this made an unnecessary copy when x was already an IntegerVector, so I wrote this second version:

    // [[Rcpp::export]]
    IntegerVector toInt3(const RObject& x) {
      NumericVector nv(x);
      for (int i = 0; i < nv.size(); i++) {
        if (nv[i] != (int)nv[i]) {
          warning("Uh-oh");
          break;
        }
      }
      return as<IntegerVector>(x);
    }
    

    But maybe the best solution is to first test whether the RObject is already of integer type and, if it is not, to fill the result vector while checking the values:

    // [[Rcpp::export]]
    SEXP toInt4(const RObject& x) {
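      // Already an integer vector: return it as is, without copying.
      if (TYPEOF(x) == INTSXP) return x;
    
      NumericVector nv(x);
      int i, n = nv.size();
      IntegerVector res(n);
      // Convert and check in the same pass; stop checking at the first mismatch.
      for (i = 0; i < n; i++) {
        res[i] = nv[i];
        if (nv[i] != res[i]) {
          warning("Uh-oh");
          break;
        }
      }
      // One warning is enough: convert the remaining elements without checking.
      for (; i < n; i++) res[i] = nv[i];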
    
      return res;
    }
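
    One caveat with the loop above: a plain C++ conversion from double to int is undefined behaviour when the value is NaN (which includes R's NA_real_) or lies outside the int range, so the nv[i] != res[i] test cannot be relied on for such inputs. Here is a minimal sketch of a stricter single-pass conversion, written in plain C++ (std::vector instead of Rcpp vectors) so that it stands alone. The names fits_int and to_int_checked are hypothetical, and mapping unrepresentable values to INT_MIN (the value R uses for NA_integer_) is an assumption of this sketch.

    ```cpp
    #include <climits>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Hypothetical helper (not part of Rcpp): true when d can be stored in an
    // int without losing information. NaN, +/-Inf, fractional values and values
    // outside the int range are rejected. INT_MIN itself is rejected too,
    // because R reserves it for NA_integer_.
    bool fits_int(double d) {
      return d > INT_MIN && d <= INT_MAX && d == std::trunc(d);
    }

    // Plain C++ analogue of the toInt4 loop: convert and check in one pass.
    // lossy is set to true if any element could not be converted exactly.
    std::vector<int> to_int_checked(const std::vector<double>& x, bool& lossy) {
      std::vector<int> res(x.size());
      lossy = false;
      for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i];
        if (d > INT_MIN && d <= INT_MAX) {
          res[i] = static_cast<int>(d);        // in range: truncates toward zero
          if (!fits_int(d)) lossy = true;      // had a fractional part
        } else {
          res[i] = INT_MIN;                    // NaN, Inf or out of range (assumption)
          lossy = true;
        }
      }
      return res;
    }
    ```

    In an Rcpp function, the same range test could guard the assignment to res[i], with the warning issued once when lossy is set.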
    

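    Some benchmarking, with an integer vector (x), a double vector whose first element is not a whole number (x2), and a double vector whose last element is not a whole number (x3):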

    x <- seq_len(1e7)
    x2 <- x; x2[1] <- 1.5
    x3 <- x; x3[length(x3)] <- 1.5
    microbenchmark::microbenchmark(
      fprive(x),  toInt2(x),  toInt3(x),  toInt4(x),
      fprive(x2), toInt2(x2), toInt3(x2), toInt4(x2),
      fprive(x3), toInt2(x3), toInt3(x3), toInt4(x3),
      times = 20
    )
    Unit: microseconds
           expr        min         lq         mean     median          uq        max neval
      fprive(x) 229865.629 233539.952 236049.68870 235623.390 238500.4335 244608.276    20
      toInt2(x)  98249.764  99520.233 102026.44305 100468.627 103480.8695 114144.022    20
      toInt3(x)  50631.512  50838.560  52307.34400  51417.296  52524.0260  58311.909    20
      toInt4(x)      1.165      6.955     46.63055     10.068     11.0755    766.022    20
     fprive(x2)  63134.534  64026.846  66004.90820  65079.292  66674.4835  74907.065    20
     toInt2(x2)  43073.288  43435.478  44068.28935  43990.455  44528.1800  45745.834    20
     toInt3(x2)  42968.743  43461.838  44268.58785  43682.224  44235.6860  51906.093    20
     toInt4(x2)  19379.401  19640.198  20091.04150  19918.388  20232.4565  21756.032    20
     fprive(x3) 254034.049 256154.851 258329.10340 258676.363 259549.3530 264550.346    20
     toInt2(x3)  77983.539  79162.807  79901.65230  79424.011  80030.3425  87906.977    20
     toInt3(x3)  73521.565  74329.410  76050.63095  75128.253  75867.9620  88240.937    20
     toInt4(x3)  22109.970  22529.713  23759.99890  23072.738  23688.5365  30905.478    20
    

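    So toInt4 seems to be the best solution: it is the fastest in all three cases, and for integer input it returns almost immediately because nothing is copied.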
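
    The question also allows an error instead of a warning. Here is a minimal sketch of that variant in plain C++ (again with std::vector rather than Rcpp types, so it compiles on its own). The name to_int_strict is hypothetical. It assumes that, inside an exported Rcpp function, a thrown std::exception would surface as an R error, which is my understanding of how Rcpp::stop works as well.

    ```cpp
    #include <climits>
    #include <cmath>
    #include <cstddef>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical strict variant: fails on the first element that is not an
    // exactly representable integer, instead of warning and truncating.
    std::vector<int> to_int_strict(const std::vector<double>& x) {
      std::vector<int> res(x.size());
      for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i];
        // Reject NaN, infinities, out-of-range values and fractional values.
        if (!(d > INT_MIN && d <= INT_MAX) || d != std::trunc(d)) {
          // Report a 1-based position, as an R user would expect.
          throw std::invalid_argument(
              "element " + std::to_string(i + 1) + " is not an integer value");
        }
        res[i] = static_cast<int>(d);
      }
      return res;
    }
    ```

    Unlike toInt4, this stops at the first offending element and returns nothing, so the caller never sees a partially truncated result.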