Search code examples
pythonnumpydistributionrandom

Sampling from bivariate normal in python


I'm trying to create two random variables which are correlated with one another, and I believe the best way is to draw from a bivariate normal distribution with given parameters (open to other ideas). The uncorrelated version looks like this:

import numpy as np
sigma = np.random.uniform(.2, .3, 80)
theta = np.random.uniform( 0, .5, 80)

However, for each one of the 80 draws, I want the sigma value to be related to the theta value. Any thoughts?


Solution

  • Use the built-in: http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.multivariate_normal.html

    >>> import numpy as np
    >>> mymeans = [13,5]  
    >>> # stdevs = sqrt(5),sqrt(2)
    >>> # corr = .3 / (sqrt(5)*sqrt(2) = .134
    >>> mycov = [[5,.3], [.3,2]]   
    >>> np.cov(np.random.multivariate_normal(mymeans,mycov,500000).T)
    array([[ 4.99449936,  0.30506976],
           [ 0.30506976,  2.00213264]])
    >>> np.corrcoef(np.random.multivariate_normal(mymeans,mycov,500000).T)
    array([[ 1.        ,  0.09629313],
           [ 0.09629313,  1.        ]])
    
    1. As shown, things get a little hairier if you have to adjust for not-unit variances)
    2. more reference: http://www.riskglossary.com/link/correlation.htm
    3. To be real-world meaningful, the covariance matrix must be symmetric and must also be positive definite or positive semidefinite (it must be invertable). Particular anti-correlation structures might not be possible.