Search code examples
rmontecarlosimulationx

Copula and simulation of binary and continuous variables


I am trying to simulate variables knowing their marginal distribution and their correlation matrix. I know we can use packages like copula but I am not familiar on how to go about it. Can someone help

#mean(w1)=0.6, sd(w1)=0.38; w1 is normally distributed
#mean(w2)=0.31; w2 is binary
#mean(w3)=0.226; w3 is binary

cor
           w1         w2         w3
w1  1.0000000 -0.3555066 -0.1986376
w2 -0.3555066  1.0000000  0.1030849
w3 -0.1986376  0.1030849  1.0000000

Solution

  • Drawing from the answer here: https://stackoverflow.com/a/10540234/6455166

    library(copula)
    set.seed(123)
    myCop <- normalCopula(param = c(-0.46, -0.27, 0.18), 
                          dim = 3, dispstr = "un")
    out <- rCopula(1e5, myCop)
    out[, 1] <- qnorm(out[, 1], mean = 0.6, sd = 0.38)
    out[, 2] <- qbinom(out[, 2], size = 1, prob = 0.31)
    out[, 3] <- qbinom(out[, 3], size = 1, prob = 0.226)
    
    cor(out)
    #            [,1]       [,2]       [,3]
    # [1,]  1.0000000 -0.3548863 -0.1943631
    # [2,] -0.3548863  1.0000000  0.1037638
    # [3,] -0.1943631  0.1037638  1.0000000
    colMeans(out)
    # [1] 0.5992595 0.3118300 0.2256000
    sd(out[, 1])
    # [1] 0.3806173
    

    Explanation. We draw correlated uniforms, and then convert each vector of uniforms to our desired distributions. The values for the param argument in normalCopula were arrived at through trial and error: start with your desired correlations (i.e. c(-0.3555, -0.1986, 0.103)), then adjust them until cor(out) produces your target correlations.