Error when bootstrapping large n with boot package (error: integer overflow)


Why can't I bootstrap a statistic for large n with the boot package? 150,000 observations is not that large, so I don't understand why this fails.

Example

library(boot)

bs <- boot(rnorm(150000), sum, R = 1000)
bs

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = rnorm(150000), statistic = sum, R = 1000)


Bootstrap Statistics :
WARNING: All values of t1* are NA

Error Message

In statistic(data, i[r, ], ...) : integer overflow - use sum(as.numeric(.))


Solution

  • You're not using boot() as documented (which is, admittedly, surprisingly complex). From ?boot:

    In all other cases ‘statistic’ must take at least two arguments. The first argument passed will always be the original data. The second will be a vector of indices, frequencies or weights which define the bootstrap sample.

    I think you want:

    bsum <- function(x,i) sum(x[i])
    bs <- boot(rnorm(150000), bsum, R = 1000)
    

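    I haven't taken the time to work out exactly what boot() is doing in your case, but it is almost certainly not what you want. Judging by the call shown in the message, statistic(data, i[r, ], ...), sum() is being handed the index vector as a second argument rather than using it to subset the data.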
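
For what it's worth, here is a minimal sketch of the difference between the two calls, followed by a quick check of the two-argument version. The names x_small, i_small and x are made up for illustration, the seed is arbitrary, and R is cut to 100 only to keep the run short. My reading is that sum(data, indices) adds the integer indices to the data instead of resampling. With n = 150000 those indices would total roughly 1e10, well beyond the integer range, which would be consistent with the overflow message, though I have not checked how this behaves across R versions.

```r
library(boot)

# Toy data and a toy index vector (hypothetical values, for illustration only)
x_small <- c(0.5, 1.5, 2.5)
i_small <- c(1L, 3L, 3L)

# Passing sum directly: the indices are summed together with the data
sum(x_small, i_small)        # 11.5 (4.5 from the data + 7 from the indices)

# Statistic with the (data, indices) signature described in ?boot
bsum <- function(x, i) sum(x[i])
bsum(x_small, i_small)       # 5.5 (0.5 + 2.5 + 2.5, i.e. the resampled data)

# Same statistic on data of the size used in the question
set.seed(1)                  # arbitrary seed, only for reproducibility
x <- rnorm(150000)
bs <- boot(x, bsum, R = 100) # fewer replicates than the original 1000

# t0 is the statistic on the original data; t holds the replicates
isTRUE(all.equal(bs$t0, sum(x)))
anyNA(bs$t)
```

If the last two lines give TRUE and FALSE respectively, the statistic is being applied to resampled data and no replicate is NA.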