Tags: c++, r, split, rcpp

Converting R split() function to C++


Consider the reproducible example in R:

test <- c(1:12)
> test
 [1]  1  2  3  4  5  6  7  8  9 10 11 12

The expected result:

test.list <- split(test, gl(2, 3))
> test.list
$`1`
[1] 1 2 3 7 8 9

$`2`
[1]  4  5  6 10 11 12

I am trying to write equivalent code in C++ that produces and returns the two vectors in test.list. Note that I am still very much a novice in C++.
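
For reference, a rough sketch of why the expected result looks the way it does: gl(2, 3) yields the labels 1 1 1 2 2 2, and split() recycles them along the 12 elements. The helper name recycled_labels is made up for this illustration (it is not part of R or Rcpp), and the sketch assumes n and size are both positive.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: 1-based group label for each of `len` positions,
// mimicking gl(n, size) recycled along a vector of length `len`.
// Assumes n >= 1 and size >= 1.
std::vector<int> recycled_labels(std::size_t len, int n, int size) {
    std::vector<int> labels(len);
    for (std::size_t k = 0; k < len; ++k) {
        // k / size is the index of the chunk that position k falls in;
        // % n cycles the chunks through the n groups
        std::size_t chunk = k / static_cast<std::size_t>(size);
        labels[k] = static_cast<int>(chunk % static_cast<std::size_t>(n)) + 1;
    }
    return labels;
}
```

Calling recycled_labels(12, 2, 3) gives the labels 1 1 1 2 2 2 1 1 1 2 2 2, which is why elements 7 to 9 end up in group 1 and elements 10 to 12 in group 2.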


Solution

  • We can use the nice answer by @jignatius and make it an R-callable function. For simplicity I keep it at NumericVector; we have a boatload of answers here that show how to switch between NumericVector and IntegerVector based on the run-time payload.

    Code

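    #include <Rcpp.h>
    #include <iterator>  // std::distance
    #include <vector>
    
    // Deal consecutive chunks of `size` elements round-robin into `n` groups.
    // [[Rcpp::export]]
    Rcpp::List mysplit(Rcpp::NumericVector nums, int n, int size) {
        // guard against an endless loop (size < 1) or a bad index (n < 1)
        if (n < 1 || size < 1) Rcpp::stop("n and size must be positive");
    
        std::vector<std::vector<double>> result(n);
        int i = 0;
        auto beg = nums.cbegin();
        auto end = nums.cend();
    
        while (beg != end) {
            // get end iterator safely: the last chunk may be shorter than size
            auto next = std::distance(beg, end) >= size ? beg + size : end;
            // append the chunk to the current group
            result[i].insert(result[i].end(), beg, next);
            // advance to the next chunk and the next group
            beg = next;
            i = (i + 1) % n;
        }
    
        // each std::vector<double> becomes one numeric vector in the list
        Rcpp::List ll;
        for (const auto& v : result)
            ll.push_back(v);
    
        return ll;
    }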
    
    /*** R
    testvec <- 1:12
    mysplit(testvec, 2, 3)
    */
    

    Output

    > Rcpp::sourceCpp("~/git/stackoverflow/68858728/answer.cpp")
    
    > testvec <- 1:12
    
    > mysplit(testvec, 2, 3)
    [[1]]
    [1] 1 2 3 7 8 9
    
    [[2]]
    [1]  4  5  6 10 11 12
    
    > 
    

    There is a minor error in the original question: we do not need a call to gl(); the two scalars (the number of groups and the chunk size) are all that is needed.
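
To try the chunking logic outside R, here is a minimal sketch that uses only the standard library. It is not the code from the answer above: split_chunks is a hypothetical name, it takes a std::vector<double> instead of an Rcpp::NumericVector, and it assigns each element by index rather than copying iterator ranges. The grouping it produces should match mysplit() for positive n and size.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical standalone variant: deal consecutive chunks of `size`
// elements round-robin into `n` groups, without any Rcpp types.
std::vector<std::vector<double>> split_chunks(const std::vector<double>& nums,
                                              int n, int size) {
    if (n < 1 || size < 1)
        throw std::invalid_argument("n and size must be positive");

    const std::size_t groups = static_cast<std::size_t>(n);
    const std::size_t chunk_len = static_cast<std::size_t>(size);

    std::vector<std::vector<double>> result(groups);
    for (std::size_t k = 0; k < nums.size(); ++k) {
        // chunk index of position k, wrapped around the available groups
        result[(k / chunk_len) % groups].push_back(nums[k]);
    }
    return result;
}
```

With the values 1 to 12, n = 2 and size = 3, the first group holds 1 2 3 7 8 9 and the second holds 4 5 6 10 11 12. A trailing chunk shorter than size simply goes to whichever group is next in turn.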