
How to concatenate Lists in Rcpp


I want to concatenate two lists in Rcpp, the way c() does in R, but I'm struggling to get the same structure as I would in R.

Here is some simple data + example:

rlist = list(a = "123")
listadd = list(typ = "fdb")
c(rlist, listadd)

which gives me this:

$a
[1] "123"

$typ
[1] "fdb"

With Rcpp, I only found push_back, which does more or less what I want, but the resulting structure is different: the second list is appended as a single nested element. I also tried to use emplace_back based on this reference, but it doesn't seem to be implemented in Rcpp.

cppFunction('
List cLists(List x, List y) {
  x.push_back(y);
  return(x);
}')

which gives me:

cLists(rlist, listadd)
$a
[1] "123"

[[2]]
[[2]]$typ
[1] "fdb"

Based on this question I know that I could use Language("c",x,y).eval(); to use R's c() function and get the correct result, but that doesn't seem to be the right way.
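For reference, a minimal sketch of that workaround. The function name cListsViaR is made up for illustration; the call simply hands both lists back to R's own c(), so it reproduces R's structure at the cost of a round trip through the R evaluator:

```r
library(Rcpp)

cppFunction('
List cListsViaR(List x, List y) {
  // Build the call c(x, y) and evaluate it in R
  return Language("c", x, y).eval();
}')

rlist = list(a = "123")
listadd = list(typ = "fdb")
res = cListsViaR(rlist, listadd)
str(res)
```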

So I was wondering how can I concatenate lists in Rcpp correctly?

EDIT: Based on @Dirk's comment, I tried to create a new list and fill it with the elements of the other two lists, but then I lose the element names.

cppFunction('
List cLists(List x, List y) {
  int nsize = x.size(); 
  int msize = y.size(); 
  List out(nsize + msize);

  for(int i = 0; i < nsize; i++) {
    out[i] = x[i];
  }
  for(int i = 0; i < msize; i++) {
    out[nsize+i] = y[i];
  }
  return(out);
}')

Output:

cLists(rlist, listadd)
[[1]]
[1] "123"

[[2]]
[1] "fdb"

Solution

  • The performance hit in your implementation seems to come from copying the names attribute into STL string vectors. You can avoid it like so:

    library(Rcpp)
    library(microbenchmark)
    cppFunction('
    List cLists(List x, List y) {
      int nsize = x.size(); 
      int msize = y.size(); 
      List out(nsize + msize);
    
      CharacterVector xnames = x.names();
      CharacterVector ynames = y.names();
      CharacterVector outnames(nsize + msize);
      out.attr("names") = outnames;
      for(int i = 0; i < nsize; i++) {
        out[i] = x[i];
        outnames[i] = xnames[i];
      }
      for(int i = 0; i < msize; i++) {
        out[nsize+i] = y[i];
        outnames[nsize+i] = ynames[i];
      }
    
      return(out);
    }')
    
    x <- as.list(runif(1e6)); names(x) <- sample(letters, 1e6, replace = TRUE)
    y <- as.list(runif(1e6)); names(y) <- sample(letters, 1e6, replace = TRUE)
    
    microbenchmark(cLists(x,y), c(x,y), times=3)
    Unit: milliseconds
             expr      min       lq     mean   median       uq      max neval cld
     cLists(x, y) 31.70104 31.86375 32.09983 32.02646 32.29922 32.57198     3  a 
          c(x, y) 47.31037 53.21409 56.41159 59.11781 60.96220 62.80660     3   b
    

    Note: by copying to std::string you also lose any character encoding information, whereas working with just R/Rcpp types preserves it.
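The function above assumes both inputs carry a names attribute. A hedged sketch of a variant that also accepts unnamed lists is shown below; the name cListsSafe is hypothetical, and the names are read through R's C API (Rf_getAttrib with R_NamesSymbol) so that a missing attribute can be detected before it is converted to a CharacterVector. Unnamed elements get an empty name, and the result stays unnamed when neither input has names, which is meant to mirror c():

```r
library(Rcpp)

cppFunction('
List cListsSafe(List x, List y) {
  int nsize = x.size();
  int msize = y.size();
  List out(nsize + msize);
  CharacterVector outnames(nsize + msize);   // every entry starts as ""

  // R_NilValue is returned when a list has no names attribute
  SEXP xn = Rf_getAttrib(x, R_NamesSymbol);
  SEXP yn = Rf_getAttrib(y, R_NamesSymbol);

  for (int i = 0; i < nsize; i++) out[i] = x[i];
  for (int i = 0; i < msize; i++) out[nsize + i] = y[i];

  if (!Rf_isNull(xn)) {
    CharacterVector xnames(xn);
    for (int i = 0; i < nsize; i++) outnames[i] = xnames[i];
  }
  if (!Rf_isNull(yn)) {
    CharacterVector ynames(yn);
    for (int i = 0; i < msize; i++) outnames[nsize + i] = ynames[i];
  }

  // Attach names only if at least one input was named
  if (!Rf_isNull(xn) || !Rf_isNull(yn)) out.attr("names") = outnames;
  return out;
}')

named   <- cListsSafe(list(a = "123"), list(typ = "fdb"))
mixed   <- cListsSafe(list(1), list(a = 2))
unnamed <- cListsSafe(list(1), list(2))
```

This version has not been benchmarked against the timings above; the extra checks happen once per call rather than once per element, so the cost should be small, but that is an expectation rather than a measurement.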