Using sapply to sample with pre-defined probability


I'm using the sample function with pre-defined probabilities.

I wrote the code below and it runs without errors. However, I have no way of checking that the sampling actually follows the probabilities I gave it. Could anybody check my work and evaluate it?

# Example data: probs is the probability of "Yes" for each row
df <- structure(list(A = c("A","B","C","D","E","F","G"),
                     probs = c(0.2,0.4,0.6,0.8,0.3,0.7,0.9)),
                class = "data.frame", row.names = c(1:7))

# Draw one "Yes"/"No" per row; "Yes" is drawn with probability x
df$pred <- sapply(df$probs, function(x) sample(c("Yes","No"), 1, prob = c(x, 1 - x)))

In df, probs is the pre-defined probability of answering "Yes". I used sapply over probs and applied the sample function to each value.
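
For comparison, the same per-row draw can be sketched without sapply by comparing one uniform random number per row against probs. This is a different technique from the sample call above, shown only as an illustration; the name pred_vec and the seed are assumptions introduced here, and the data frame is rebuilt so the snippet runs on its own.

```r
# Same example data as in the question
df <- data.frame(A = c("A", "B", "C", "D", "E", "F", "G"),
                 probs = c(0.2, 0.4, 0.6, 0.8, 0.3, 0.7, 0.9))

set.seed(1)  # arbitrary seed, only for reproducibility

# runif() returns one uniform draw in (0, 1) per row. A draw falls below
# probs[i] with probability probs[i], so row i gets "Yes" with that probability.
pred_vec <- ifelse(runif(nrow(df)) < df$probs, "Yes", "No")
pred_vec
```

Because the comparison is vectorized, this avoids calling sample once per row, which may matter for large data frames.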


Solution

  • One way to check this is to draw a large sample for each probability and compare the observed proportion of "Yes" with the value in df$probs.

    n <- 1e6
    set.seed(123)
    sapply(df$probs,function(x) 
              table(sample(c("Yes","No"),n,prob=c(x,1-x),replace=TRUE))/n)
    
    
    #       [,1]     [,2]    [,3]     [,4]     [,5]     [,6]     [,7]
    #No  0.80006 0.599886 0.40003 0.200072 0.699906 0.299314 0.100044
    #Yes 0.19994 0.400114 0.59997 0.799928 0.300094 0.700686 0.899956
    

    Every "Yes" proportion is within about 0.001 of the corresponding value in df$probs, so the sampling code behaves as intended.
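
To make "almost the same" a little more concrete, the check can be sketched as a comparison against the standard error of a binomial proportion, sqrt(p * (1 - p) / n). This is one possible way to quantify the agreement, not part of the original answer; the names obs, se and z, and the smaller n, are assumptions chosen for illustration.

```r
probs <- c(0.2, 0.4, 0.6, 0.8, 0.3, 0.7, 0.9)  # df$probs from the question
n <- 1e5
set.seed(123)

# Observed proportion of "Yes" for each probability
obs <- sapply(probs, function(x)
  mean(sample(c("Yes", "No"), n, prob = c(x, 1 - x), replace = TRUE) == "Yes"))

# Standard error of a proportion estimated from n independent draws
se <- sqrt(probs * (1 - probs) / n)

# Distance of each observed proportion from its target, in standard errors
z <- (obs - probs) / se
round(z, 2)
```

If the sampling is correct, most values of z should be small in absolute value (roughly within 2 or 3); values that are consistently far outside that range would suggest the probabilities are not being applied as expected.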