
Generating Integer Sequences based on a Modified Bernoulli Distribution


I want to use R to randomly generate an integer sequence in which each integer is picked, with replacement, from a pool of integers (0, 1, 2, 3, ..., k), where k is predetermined. The selection probability for each integer i in the pool is p^i (1-p), where p is predetermined. That is, 1 has a much higher probability of being picked than k, so my final sequence will likely contain more 1s than ks. I am not sure how to implement this selection process in R.


Solution

  • A generic approach to this type of problem would be:

    1. Calculate the weight p^i * (1-p) for each integer i in 0..k.
    2. Store the cumulative sum of these weights in a table t.
    3. Draw a number from a uniform distribution between 0 and the last (largest) value in t.
    4. Find which interval of t that number falls into and check which integer that interval corresponds to.

    The larger the weight of an integer, the larger the part of that range its interval covers, so the more often it is drawn.
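To make the lookup step concrete, here is a small trace of the steps above for one draw. The values k = 3, p = 0.5 and the stand-in draw u = 0.8 are assumptions picked only for illustration, and the trace assumes the uniform draw spans 0 to the last cumulative value.

```r
# Hypothetical values, chosen only so the arithmetic is easy to follow
k <- 3
p <- 0.5
v <- 0:k                      # candidate integers 0, 1, 2, 3
w <- p^v * (1 - p)            # weights: 0.5, 0.25, 0.125, 0.0625
cum <- cumsum(w)              # upper edges: 0.5, 0.75, 0.875, 0.9375
u <- 0.8                      # stand-in for one uniform draw on (0, max(cum))
idx <- findInterval(u, cum)   # counts the edges at or below u, here 2
v[idx + 1]                    # integer whose slice contains u, here 2
```

Since 0.75 <= 0.8 < 0.875, the draw lands in the third slice, which belongs to the integer 2.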

    Here's quick and dirty example code:

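    draw <- function(n=1, k, p) {
        v <- seq( 0, k )                  ## candidate integers 0..k
        pr <- (p ^ v) * (1-p)             ## weight of each integer
        t <- cumsum(pr)                   ## upper edge of each integer's interval
        ## lower bound is 0 so that the first interval [0, t[1]) maps to integer 0
        x <- runif( n, min=0, max=max(t) )
        f <- findInterval( x, vec=t )     ## number of edges at or below x
        v[ f+1 ] ## f is 0 in the first interval; runif does not return max(t) itself
    }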
    
    

    Note that the proposed solution doesn't require your weights to add up to 1. For finite k they sum to 1 - p^(k+1), which is slightly less than 1, but that doesn't matter for the solution because the uniform draw is scaled to the cumulative total.
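As an alternative to building the cumulative table by hand, base R's `sample()` accepts a `prob` vector of weights that need not sum to one. The following is a minimal sketch under that approach, not the method given in the answer above; `draw_sample`, the seed, and the values k = 5, p = 0.3 are hypothetical choices for illustration.

```r
# Sketch: draw n integers from 0..k with weights p^i * (1 - p).
# draw_sample is a hypothetical helper name, not part of the original answer.
draw_sample <- function(n, k, p) {
    v <- 0:k
    w <- p^v * (1 - p)   # unnormalised weights; sample() rescales them
    sample(v, size = n, replace = TRUE, prob = w)
}

set.seed(1)  # arbitrary seed so the check is repeatable
x <- draw_sample(1e5, k = 5, p = 0.3)

# Compare observed frequencies with the normalised target probabilities
target <- 0.3^(0:5) * 0.7
observed <- as.vector(table(factor(x, levels = 0:5))) / length(x)
round(rbind(observed = observed, expected = target / sum(target)), 3)
```

With a large n, the observed row should sit close to the expected row, which is a quick way to confirm that 0 is drawn most often and k least often.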