rcpp

Rcpp sugar unique of List


I have a List of NumericVector and I need a List containing only its unique elements. I tried the Rcpp::unique function. It works very well when applied to a NumericVector, but not to a List. Here are the code and the error I got.

List h(List x){
  return Rcpp::unique(x);
}

Error in dyn.load("/tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so") : unable to load shared object '/tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so': /tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so: undefined symbol: _ZNK4Rcpp5sugar9IndexHashILi19EE8get_addrEP7SEXPREC


Solution

  • Thank you for your interest in this issue. As I mentioned, my List contains only NumericVector elements. I propose the code below, which works well and, on a small list such as the one benchmarked here, is faster than R's unique function. However, its efficiency decreases when the list is large. Maybe this can help someone, and someone may be able to optimise it further.

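    #include <Rcpp.h>
    using namespace Rcpp;

    // Returns the distinct NumericVector elements of x, in order of first appearance.
    // [[Rcpp::export]]
    List uniqueList(List& x) {
      int xsize = x.size();
      if (xsize == 0) return x;

      // Separate output list: List xunique(x) would share storage with x,
      // so writing to it would also modify the caller's list.
      List xunique(xsize);
      xunique[0] = x[0];
      int s = 1;  // number of unique elements kept so far
      for (int i = 1; i < xsize; ++i) {
        NumericVector xi = x[i];
        bool duplicate = false;
        // Compare xi with the s unique elements already kept.
        for (int j = 0; j < s && !duplicate; ++j) {
          NumericVector xj = xunique[j];
          // Equal only if the lengths match and every element matches.
          duplicate = xi.size() == xj.size() && is_true(all(xi == xj));
        }
        if (!duplicate) {
          xunique[s] = xi;
          ++s;
        }
      }
      return head(xunique, s);
    }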
    
    /*** R
    x <- list(1, 42, 1, 1:3, 42)
    uniqueList(x)
    # [[1]]
    # [1] 1
    #
    # [[2]]
    # [1] 42
    #
    # [[3]]
    # [1] 1 2 3

    microbenchmark::microbenchmark(uniqueList(x), unique(x))
    # Unit: microseconds
    #           expr   min    lq    mean median     uq    max neval
    #  uniqueList(x) 2.382 2.633 3.05103  2.720 2.8995 29.307   100
    #      unique(x) 2.864 3.110 3.50900  3.254 3.4145 24.039   100
    */
    

    However, R's unique function becomes faster when the List is large. I am sure that someone can optimise this code.
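
One possible direction for that optimisation, as a rough sketch rather than a tested Rcpp solution: the nested loop compares each element with every unique element kept so far, so its cost grows roughly quadratically with the list size. Keeping the already-seen vectors in a hash set brings this down to about one lookup per element on average. The code below is plain C++, with std::vector<double> standing in for NumericVector; the names Vec, VecHash and uniqueVectors are hypothetical, and the conversion to and from an Rcpp List is left out.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Plain C++ stand-in for a NumericVector (an assumption, not the Rcpp type).
using Vec = std::vector<double>;

// Hypothetical hash functor: mixes the length and every element of the vector.
struct VecHash {
  std::size_t operator()(const Vec& v) const {
    std::size_t h = v.size();
    for (double d : v) {
      h ^= std::hash<double>{}(d) + 0x9e3779b9u + (h << 6) + (h >> 2);
    }
    return h;
  }
};

// Keeps the first occurrence of each distinct vector, preserving input order.
std::vector<Vec> uniqueVectors(const std::vector<Vec>& x) {
  std::unordered_set<Vec, VecHash> seen;  // equality is std::vector's operator==
  std::vector<Vec> out;
  for (const Vec& v : x) {
    // insert() reports true in .second only if v was not already in the set.
    if (seen.insert(v).second) {
      out.push_back(v);
    }
  }
  return out;
}
```

Called on the equivalent of list(1, 42, 1, 1:3, 42), this keeps the three vectors 1, 42 and 1 2 3. As in the loop version, a vector containing NaN never compares equal to another, so such vectors are all kept.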