R simple Bootstrap


I have a data frame (Applications) with two columns

Customer    Application
1           1
1           0
1           0
1           1
1           1
1           0
1           1
1           0
1           0
1           1
1           1

Where the application rate is

sum(Applications$Application)/sum(Applications$Customer).

I've been asked to bootstrap this application rate by running 1000 samples of 1000 customers to get a distribution and confidence level for the application rate. I tried using the boot package as follows:

f2 <- function(Loan,Customer){sum(Applications$Application)/sum(Applications$Customer)}
bootapp1 <-(boot(Applications, f2, 1000))
bootapp1

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = Bootstrap_Test, statistic = f2, R = 1000)


Bootstrap Statistics :
       original  bias    std. error
t1* 0.003052608       0           0

Obviously this isn't what I'm looking for, as it reports a bias and standard error of 0.

Would anyone be able to tell me a quick way of getting the results I need? I imagine there must be a really simple way of doing it.


Solution

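  • You just need to tweak your function. As written, f2 ignores its arguments and always computes the rate on the full Applications data frame, so every replicate returns the same value and the bias and standard error come out as 0. The statistic needs two arguments, the data and an index vector, and it must use the index to subset the data. From the help file on boot, under the argument statistic: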

    A function which when applied to data returns a vector containing the statistic(s) of interest. When sim = "parametric", the first argument to statistic must be the data. For each replicate a simulated dataset returned by ran.gen will be passed. In all other cases statistic must take at least two arguments. The first argument passed will always be the original data. The second will be a vector of indices, frequencies or weights which define the bootstrap sample.

    library(boot)
    # Example data: 11 customers, 6 of whom applied
    x <- data.frame(
      Customer    = rep(1L, 11),
      Application = c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L)
    )
    # boot() passes the data and a vector of resampled row indices;
    # subsetting by index makes each replicate use its own sample
    f2 <- function(x, index){sum(x[index, "Application"])/sum(x[index, "Customer"])}
    bootapp1 <- boot(data = x, statistic = f2, R = 1000)
    bootapp1
    
    ORDINARY NONPARAMETRIC BOOTSTRAP
    
    
    Call:
      boot(data = x, statistic = f2, R = 1000)
    
    
    Bootstrap Statistics :
      original       bias    std. error
    t1* 0.5454545 0.0005454545     0.14995
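
The question also asks for the distribution and a confidence interval, which the output above does not show. Here is a minimal sketch of one way to get them from the boot object, assuming the same toy data and a percentile interval at the 95% level. The seed, the 95% level, and the choice of the percentile method are illustrative assumptions and were not part of the original answer.

```r
library(boot)

# Same toy data as above: 11 customers, 6 of whom applied.
x <- data.frame(
  Customer    = rep(1L, 11),
  Application = c(1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L)
)

# Statistic: application rate computed on the resampled rows only.
f2 <- function(data, index) {
  sum(data[index, "Application"]) / sum(data[index, "Customer"])
}

set.seed(42)  # arbitrary seed (assumption), only to make the run reproducible
bootapp1 <- boot(data = x, statistic = f2, R = 1000)

# bootapp1$t holds the rate from each of the R replicates;
# its quantiles summarise the bootstrap distribution.
print(quantile(bootapp1$t, probs = c(0.025, 0.5, 0.975)))

# Percentile confidence interval (95% level chosen for illustration).
ci <- boot.ci(bootapp1, conf = 0.95, type = "perc")
print(ci)
```

Calling hist(bootapp1$t) plots the same distribution. Note that boot draws as many rows per replicate as the data frame contains, so with the full Applications data each of the 1000 replicates resamples every customer row.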