python numpy probability sampling

Python: sampling from different distributions with different probabilities


I am trying to implement a function which returns 100 samples drawn from three different multivariate Gaussian distributions.

NumPy provides a way to sample from a single multivariate Gaussian, but I could not find a way to sample from three different multivariate Gaussians with different sampling probabilities.

My requirement is to sample with probabilities $[0.7, 0.2, 0.1]$ from three multivariate Gaussians with the means and covariances given below:

G_1: mean = [1, 1], cov = [[5, 1], [1, 5]]
G_2: mean = [0, 0], cov = [[5, 1], [1, 5]]
G_3: mean = [-1, -1], cov = [[5, 1], [1, 5]]

Any ideas?


Solution

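  • Say you create a list of generators. Each entry must be a callable (here a lambda): calling np.random.multivariate_normal directly would store one fixed sample per distribution instead of a generator.

    import numpy as np

    cov = [[5, 1], [1, 5]]
    # Each lambda draws a fresh sample every time it is called
    generators = [
        lambda: np.random.multivariate_normal([1, 1], cov),
        lambda: np.random.multivariate_normal([0, 0], cov),
        lambda: np.random.multivariate_normal([-1, -1], cov)]
    

    Now you can draw a weighted random sequence of generator indices, since np.random.choice supports weighted sampling:

    draw = np.random.choice([0, 1, 2], 100, p=[0.7, 0.2, 0.1])
    

    (draw is a length-100 array of entries, each from {0, 1, 2} with probability 0.7, 0.2, 0.1, respectively.)

    Now just generate the samples by calling the chosen generator once per index:

    samples = np.array([generators[i]() for i in draw])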
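
    The same idea can also be written without a Python-level loop over all 100 draws. The following is only a sketch, not part of the original answer: the function name sample_mixture and its parameters are made up for illustration, and it assumes NumPy's newer Generator API (np.random.default_rng) is available. It picks a component label for every sample first, then fills in all samples belonging to each component in one call.

```python
import numpy as np

def sample_mixture(n, weights, means, covs, seed=None):
    """Hypothetical helper: draw n samples from a Gaussian mixture."""
    rng = np.random.default_rng(seed)
    # Pick which component each of the n samples comes from
    labels = rng.choice(len(weights), size=n, p=weights)
    samples = np.empty((n, len(means[0])))
    for k, (mean, cov) in enumerate(zip(means, covs)):
        mask = labels == k
        count = int(mask.sum())
        if count:
            # Draw every sample assigned to component k in a single call
            samples[mask] = rng.multivariate_normal(mean, cov, size=count)
    return samples, labels

# Parameters taken from the question
weights = [0.7, 0.2, 0.1]
means = [[1, 1], [0, 0], [-1, -1]]
covs = [[[5, 1], [1, 5]]] * 3

samples, labels = sample_mixture(100, weights, means, covs, seed=0)
print(samples.shape)
print(np.bincount(labels, minlength=3) / len(labels))
```

    The second print shows the empirical share of each component, which should be close to the requested probabilities but will not match them exactly for only 100 draws.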