
Is Rcpp::DataFrame::create limited to 20 arguments?


We are creating the following dataframe inside an Rcpp function:

  Rcpp::DataFrame res =
    Rcpp::DataFrame::create(
     Rcpp::Named("A")=a
    ,Rcpp::Named("B")=b
    ,Rcpp::Named("C")=c
    ,Rcpp::Named("D")=d
    ,Rcpp::Named("E")=e
    ,Rcpp::Named("F")=f
    ,Rcpp::Named("G")=g
    ,Rcpp::Named("H")=h
    ,Rcpp::Named("I")=i
    ,Rcpp::Named("J")=j
    ,Rcpp::Named("K")=k
    ,Rcpp::Named("L")=l
    ,Rcpp::Named("M")=m
    ,Rcpp::Named("N")=n
    ,Rcpp::Named("O")=o
    ,Rcpp::Named("P")=p
    ,Rcpp::Named("Q")=q
    ,Rcpp::Named("R")=r
    ,Rcpp::Named("S")=s
    ,Rcpp::Named("T")=t
    ,Rcpp::Named("U")=u
  );

This data frame is meant to be the function's return value. However, the code fails to compile with the following error:

error: no matching function for call to Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>::create
In file included from local/lib64/R/library/Rcpp/include/Rcpp/DataFrame.h:97:0,
                 from local/lib64/R/library/Rcpp/include/Rcpp.h:57,
                 from file54f121e6a937.cpp:1:
local/lib64/R/library/Rcpp/include/Rcpp/generated/DataFrame_generated.h:142:23: note: template<class T1, class T2, class T3, class T4, class T5, class T6, class T7, class T8, class T9, class T10, class T11, class T12, class T13, class T14, class T15, class T16, class T17, class T18, class T19, class T20> static Rcpp::DataFrame_Impl<StoragePolicy> Rcpp::DataFrame_Impl<StoragePolicy>::create(const T1&, const T2&, const T3&, const T4&, const T5&, const T6&, const T7&, const T8&, const T9&, const T10&, const T11&, const T12&, const T13&, const T14&, const T15&, const T16&, const T17&, const T18&, const T19&, const T20&) [with T1 = T1; T2 = T2; T3 = T3; T4 = T4; T5 = T5; T6 = T6; T7 = T7; T8 = T8; T9 = T9; T10 = T10; T11 = T11; T12 = T12; T13 = T13; T14 = T14; T15 = T15; T16 = T16; T17 = T17; T18 = T18; T19 = T19; T20 = T20; StoragePolicy = Rcpp::PreserveStorage]
 static DataFrame_Impl create( const T1& t1, const T2& t2, const T3& t3, const T4& t4, const T5& t5, const T6& t6, const T7& t7, const T8& t8, const T9& t9, const T10& t10, const T11& t11, const T12& t12, const T13& t13, const T14& t14, const T15& t15, const T16& t16, const T17& t17, const T18& t18, const T19& t19, const T20& t20 ) {
                       ^
local/lib64/R/library/Rcpp/include/Rcpp/generated/DataFrame_generated.h:142:23: note:   template argument deduction/substitution failed:
file54f121e6a937.cpp:771:7: note:   candidate expects 20 arguments, 21 provided

It compiles fine with 20 arguments. How can we work around this limit? Thanks


Solution

  • Yes. This is covered in many different places...

    Off the top of my head:

    https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-FAQ.pdf#page=17

    In essence, to compile with the largest possible number of compilers, Rcpp is constrained by the older C++ standards, which do not support variadic templates. So we use macros and code-generator scripts to enumerate the overloads explicitly, one per argument count, and that enumeration has to stop at some limit. We chose 20.
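
    To illustrate the difference, here is a minimal standalone sketch in plain C++11 (no Rcpp required). EnumeratedBuilder and VariadicBuilder are hypothetical names invented for this example and are not part of Rcpp. The first mimics a generated header with one overload per argument count (capped at 3 here rather than 20); the second uses a single variadic template.

```cpp
#include <cstddef>

// Hypothetical stand-in for a generated header: one overload per arity,
// stopping at a fixed limit (3 here; the generated create() in the
// question's error message stops at 20).
struct EnumeratedBuilder {
    template <class T1>
    static std::size_t create(const T1&) { return 1; }

    template <class T1, class T2>
    static std::size_t create(const T1&, const T2&) { return 2; }

    template <class T1, class T2, class T3>
    static std::size_t create(const T1&, const T2&, const T3&) { return 3; }
    // No four-argument overload exists, so a four-argument call
    // is rejected at compile time.
};

// C++11 variadic template: a single definition accepts any number
// of arguments and reports how many it received.
struct VariadicBuilder {
    template <class... Ts>
    static std::size_t create(const Ts&...) { return sizeof...(Ts); }
};
```

    Calling EnumeratedBuilder::create with four arguments fails with a "no matching function" error, much like the 21-argument call in the question, whereas VariadicBuilder::create(1, 2.0, 'c', "d", 5L) compiles and returns 5.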


    To create a data.frame with more than 20 columns, build a list and then coerce it to a data.frame.

    Sample code:

    #include <Rcpp.h>
    
    // [[Rcpp::export]]
    Rcpp::List dynamic_df(Rcpp::DataFrame df) {
    
      // Number of variables in data.frame
      int num_vars = df.ncol();
    
      // Instantiate a list with one entry per variable
      Rcpp::List long_list(num_vars);
    
      // Vector to hold the column names
      Rcpp::CharacterVector namevec(num_vars);
    
      // Copy each column from the data.frame into the list
      for (int i = 0; i < num_vars; ++i) {
        long_list[i] = df(i); // Column i of the data.frame
        namevec[i] = i;       // Name the column by its zero-based index
      }
    
      // Add colnames
      long_list.attr("names") = namevec;
    
      // Coerce list to data.frame
      long_list.attr("row.names") = Rcpp::IntegerVector::create(NA_INTEGER, df.nrow());
      long_list.attr("class") = "data.frame";
    
      // Return the result; R will see it as a data.frame
      return long_list;
    }
    
    /*** R
    
    head(dynamic_df(mtcars))
    
    */
    

    Output:

    head(dynamic_df(mtcars))
    #      0 1   2   3    4     5     6 7 8 9 10
    # 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4  4
    # 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4  4
    # 3 22.8 4 108  93 3.85 2.320 18.61 1 1 4  1
    # 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3  1
    # 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3  2
    # 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3  1
    

    That said, you should really consider using the List_Builder class by Kevin, posted in the duplicate question:

    how many vectors can be added in DataFrame::create( vec1, vec2 ... )?