What's the best practice for setting a default value for a const reference DataFrame argument in Rcpp?


In an Rcpp function foo, the argument data may be a big data frame, so I'd like to take it as a const reference. Now I want its default value to be an empty data frame, so that users can simply call foo() rather than foo(data.frame()) in R.

I've tried this:

#include <Rcpp.h>
using namespace Rcpp;

//[[Rcpp::export]]
DataFrame foo(const DataFrame &data = DataFrame())
{
    // Do something
    return data;
}

But this has no effect; it seems DataFrame() is not recognized as a default value. I also tried overloading the function:

#include <Rcpp.h>
using namespace Rcpp;

//[[Rcpp::export]]
DataFrame foo(const DataFrame &data)
{
    // Do something
    return data;
}

//[[Rcpp::export]]
DataFrame foo()
{
    DataFrame data = DataFrame::create();
    return foo(data);
}

But Rcpp reports a conflicting declaration. I also tried Nullable:

#include <Rcpp.h>
using namespace Rcpp;

//[[Rcpp::export]]
DataFrame foo(Nullable<DataFrame> data_ = R_NilValue)
{
    DataFrame data;
    if (data_.isNotNull())
    {
        data = DataFrame(data_);
    }
    else
    {
        data = DataFrame::create();
    }
    // Do something
    return data;
}

It works well this time, but I wonder:

  1. Is this the best (simplest) way to do this?
  2. In this case, will data be passed by copy or by reference?

Or are there any other solutions?


Solution

  • "If it doesn't work, try something simpler."

    A DataFrame is a great type, but it isn't a native SEXP object, which makes creating Rcpp infrastructure for it hard(er). A DataFrame really is just a list of (equal-length) vectors. So on return, I mostly create the DataFrame at the very last step, treating the different columns (that constitute it) as the primary objects. And ditto on the way in: get the columns out, and work on those vectors.

    I tried to accumulate a little bit of documentation in the RcppExamples package and at the Rcpp Gallery. But that is it.

    Lastly, "how does it come in?": that is generally not a worry, as Rcpp always interfaces with R via .Call(), which takes (zero or more) SEXP and returns a SEXP. So you are always in the realm of pointers, and full-size objects are not being copied.

    If you have ideas (and, better, code) to make things better, we are likely going to be all ears.
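As a rough illustration of the "realm of pointers" remark in the answer, here is a plain C++ analogy that runs without R. It is not Rcpp's implementation: Handle, Payload, and the helper functions are hypothetical names, and std::shared_ptr merely stands in for the SEXP pointer that an Rcpp wrapper such as DataFrame holds. Under those assumptions, it shows that handing such a handle to a function, by value or by const reference, leaves the underlying data where it is.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a large R object (e.g. the columns of a data frame).
using Payload = std::vector<double>;

// Hypothetical handle: like an Rcpp wrapper, it stores only a pointer
// to the data, never the data itself. (Assumption: shared_ptr is used
// here purely as an analogy for a SEXP.)
struct Handle {
    std::shared_ptr<Payload> ptr;
};

// Pass by value: copies the handle (one pointer), not the payload.
const Payload* address_by_value(Handle h) { return h.ptr.get(); }

// Pass by const reference: not even the handle is copied.
const Payload* address_by_ref(const Handle& h) { return h.ptr.get(); }

// Returns true when both calling styles see the very same payload.
bool same_payload_either_way(std::size_t n) {
    Handle h{std::make_shared<Payload>(n, 1.0)};
    const Payload* original = h.ptr.get();
    assert(original != nullptr);
    return address_by_value(h) == original &&
           address_by_ref(h) == original;
}
```

If the analogy holds, the Nullable<DataFrame> version in the question should not copy the data frame's contents on the way in; only the small wrapper object is passed around, so the choice between by-value and const-reference matters far less than it would for an ordinary C++ container.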