Search code examples
rrcpp

Rcpp R vector size limit (negative length vectors are not allowed)


As per https://stackoverflow.com/a/48676389/3846213 and https://stackoverflow.com/a/5234293/3846213 R vectors are limited 2^31 - 1 items. However, I've been able to trigger the "negative length vectors are not allowed" error (which is supposed to be a sign of trying to allocate an overly large vector) at half that number via Rcpp. This all comes from my trying to debug an Rcpp-based R package (https://github.com/tpq/propr/issues/13).

library(Rcpp)

cppFunction("
IntegerVector test(int size) {
    int veclen = size * (size - 1) / 2;
    IntegerVector vec(veclen);
    return vec;
}
")

vec <- test(47000)
Error in test(47000) : negative length vectors are not allowed

47000^2 / 2 is nearly half of 2^31. I've got no such problems in pure R, that is vec <- 1:(47000*(47000-1)/2) runs fine, so there should be something special with Rcpp.


Solution

  • The problem is multiplication overflow. When you do

    size * (size - 1) / 2
    

    order of operations bites you, because

    size * (size - 1)
    

    can overflow even if the overall expression doesn't. We can see this by adding a printing statement:

    IntegerVector test(int size) {
        int veclen = size * (size - 1) / 2;
        Rcpp::Rcout << veclen << std::endl;
        IntegerVector vec(veclen);
        return vec;
    }
    
    vec <- test(47000)
    # -1043007148
    

    So, we can fix it by changing up how we do that operation:

    IntegerVector test(int size) {
        int veclen = (size / 2) * (size - 1);
        Rcpp::Rcout << veclen << std::endl;
        IntegerVector vec(veclen);
        return vec;
    }
    

    which gives no issue

    vec <- test(47000)
    # 1104476500
    str(vec)
    # int [1:1104476500] 0 0 0 0 0 0 0 0 0 0 ...
    

    Update: The problem with odd numbers

    Eli Korvigo brings up an excellent point in the comments about integer division behavior with odd numbers. To illustrate consider calling the function with the even number 4 and the odd number 5

    even <- 4
    odd  <- 5
    
    even * (even - 1) / 2
    # [1] 6
    odd  * (odd  - 1) / 2
    # [1] 10
    

    It should create vectors of length 6 and 10 respectively. But, what happens?

    test(4)
    # 6
    # [1] 0 0 0 0 0 0
    test(5)
    # 8
    # [1] 0 0 0 0 0 0 0 0
    

    Oh no! 5 / 2 in integer division is 2, not 2.5, so this does not quite do what we want in the odd case. However, luckily we can easily address this with a simple flow control:

    IntegerVector test2(int size) {
        int veclen;
        if ( size % 2 == 0 ) {
            veclen = (size / 2) * (size - 1);
        } else {
            veclen = size * ((size - 1) / 2);
        }
        Rcpp::Rcout << veclen << std::endl;
        IntegerVector vec(veclen);
        return vec;
    }
    

    We can see this handles the odd and even cases both just fine:

    test2(4)
    # 6
    # [1] 0 0 0 0 0 0
    test2(5)
    # 10
    # [1] 0 0 0 0 0 0 0 0 0 0