Tags: python, numpy, statistics, probability, sampling

Is there a NumPy function to randomly sample multiple probabilities?


I have an array of probabilities that I'd like to sample.

Say p=[0.5,0.9,0.1,0.4]. I'd like to sample an array of 0s and 1s of length 4, where element i is 1 with probability p[i].

Essentially, I'm looking for a vectorized version of calling np.random.choice(2, p=[1 - p_i, p_i]) once for each p_i in p.


Solution

  • One approach is to generate an array from a uniform distribution and compare it against the values in p:

    import numpy as np
    
    p = [0.5, 0.9, 0.1, 0.4]
    
    # One uniform draw in [0, 1) per probability; element i is 1 when its draw is below p[i]
    res = (np.random.random(len(p)) < p).astype(np.uint32)
    print(res)
    

    Output (of a single run)

    [0 1 0 0]
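
A quick way to see why the comparison works: a uniform draw on [0, 1) falls below p[i] with probability p[i]. The sketch below checks this empirically. It is an illustration, not part of the original answer, and it assumes the newer Generator API (np.random.default_rng) with a fixed seed chosen only for reproducibility; the names sample, draws, and freq are made up for the example.

```python
import numpy as np

# Assumption: a seeded Generator, used here only so the sketch is reproducible.
rng = np.random.default_rng(seed=0)

p = np.array([0.5, 0.9, 0.1, 0.4])

# One uniform draw per probability; element i is 1 when its draw is below p[i].
sample = (rng.random(p.shape) < p).astype(np.uint8)

# Repeat the experiment many times; p broadcasts across the rows.
n_trials = 100_000
draws = rng.random((n_trials, p.size)) < p

# The fraction of 1s in each column should be close to the matching p[i].
freq = draws.mean(axis=0)

print(sample)
print(freq.round(2))
```

Because the uniform draws lie in [0, 1), an entry with p[i] = 0 always yields 0 and an entry with p[i] = 1 always yields 1.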
    

    As an alternative, you can consider each value to be drawn from a Bernoulli distribution, and given that (quote):

    The Bernoulli distribution is a special case of the binomial distribution where a single trial is conducted (so n would be 1 for such a binomial distribution)

    you could do:

    p = [0.5, 0.9, 0.1, 0.4]
    res = np.random.binomial(1, p, size=len(p))
    print(res)
    

    Output (of a single run)

    [0 1 0 1]
    

    Note that np.random.binomial accepts an array for its p argument. From the documentation:

    p float or array_like of floats
    Parameter of the distribution, >= 0 and <=1.
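
Since p may be an array, the size argument can be left out or used to request several independent samples at once. The following is a minimal sketch under that assumption, again using a seeded Generator from np.random.default_rng rather than the legacy np.random.binomial call shown above; the names one_draw and many_draws are hypothetical.

```python
import numpy as np

# Assumption: a seeded Generator, used here only so the sketch is reproducible.
rng = np.random.default_rng(seed=1)

p = np.array([0.5, 0.9, 0.1, 0.4])

# With an array p and no size argument, one Bernoulli value is drawn per entry of p.
one_draw = rng.binomial(1, p)

# A size whose last dimension equals len(p) yields several independent rows,
# with column i still using probability p[i].
many_draws = rng.binomial(1, p, size=(5, p.size))

print(one_draw.shape, many_draws.shape)  # (4,) (5, 4)
```

Each row of many_draws is one sample of the kind the question asks for.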