I am trying to simulate variables knowing their marginal distribution and their correlation matrix. I know we can use packages like copula but I am not familiar on how to go about it. Can someone help
#mean(w1)=0.6, sd(w1)=0.38; w1 is normally distributed
#mean(w2)=0.31; w2 is binary
#mean(w3)=0.226; w3 is binary
cor
w1 w2 w3
w1 1.0000000 -0.3555066 -0.1986376
w2 -0.3555066 1.0000000 0.1030849
w3 -0.1986376 0.1030849 1.0000000
Drawing from the answer here: https://stackoverflow.com/a/10540234/6455166
library(copula)
set.seed(123)
myCop <- normalCopula(param = c(-0.46, -0.27, 0.18),
dim = 3, dispstr = "un")
out <- rCopula(1e5, myCop)
out[, 1] <- qnorm(out[, 1], mean = 0.6, sd = 0.38)
out[, 2] <- qbinom(out[, 2], size = 1, prob = 0.31)
out[, 3] <- qbinom(out[, 3], size = 1, prob = 0.226)
cor(out)
# [,1] [,2] [,3]
# [1,] 1.0000000 -0.3548863 -0.1943631
# [2,] -0.3548863 1.0000000 0.1037638
# [3,] -0.1943631 0.1037638 1.0000000
colMeans(out)
# [1] 0.5992595 0.3118300 0.2256000
sd(out[, 1])
# [1] 0.3806173
Explanation. We draw correlated uniforms, and then convert each vector of uniforms to our desired distributions. The values for the param
argument in normalCopula
were arrived at through trial and error: start with your desired correlations (i.e. c(-0.3555, -0.1986, 0.103)
), then adjust them until cor(out)
produces your target correlations.