Search code examples
rprobabilitycopula

Calculating Conditional Distributions using Copula in R


I have three random variables X, Z, and Y, and I know their distribution functions F(X), F(Z), and F(Y). However, the dependence structure among them is not known. I want to calculate the conditional distributions F(Y|X) and F(X|Y) given X, Z, and Y.

I'm using the copula and gamlss.dist packages in R to generate a Gumbel copula with theta=2 and create a multivariate distribution (mvdc). The code snippet below generates a specific probability using the generated copula:

pacman::p_load(copula, gamlss.dist)

# Generate Gumbel copula with theta=2
thetaVal <- 2
copula <- gumbelCopula(thetaVal, dim = 3)

copPre <- mvdc(copula, c("SHASH", "norm", "norm"),
               list(
                 list(mu = 46, sigma = 5, nu = 2, tau = 1),  # X SHASH
                 list(mean = 46, sd = 1),                      # Y normal
                 list(mean = 23, sd = 1)))                     # Z normal

pMvdc(c(47, 45, 22), copPre)

The last line of code gives the probability that (X ≤ 47),(Y ≤ 45), and (Z ≤ 22). I would like guidance on calculating the conditional distributions F(Y|X) and F(X|Y) using the provided values for X, Z, and Y and the copula model.

Thanks in advance for your assistance!


Solution

  • To get a conditional probability like Pr(X <= x | Y <= y), you can use the formula Pr(X <= x, Y <= y) / Pr(Y <= y):

    library(copula)
    library(gamlss.dist)
    
    # Generate Gumbel copula with theta=2
    thetaVal <- 2
    copula <- gumbelCopula(thetaVal, dim = 3)
    
    copPre <- mvdc(copula, c("SHASH", "norm", "norm"),
                   list(
                     list(mu = 46, sigma = 5, nu = 2, tau = 1),  # X SHASH
                     list(mean = 46, sd = 1),                      # Y normal
                     list(mean = 23, sd = 1)))                     # Z normal
    
    pMvdc(c(47, 45, 22), copPre)
    
    # Pr(X <= x | Y <= 45)
    x <- 50
    pMvdc(c(x, 45, Inf), copPre) / pnorm(45, 46, 1)
    # or:
    pMvdc(c(x, 45, Inf), copPre) / pMvdc(c(Inf, 45, Inf), copPre)
    

    For Pr(X <= x | Y = y) you have to use the cCopula function, which is, to me, a bit complicated to understand.

    # Pr(X <= x | Y = y)
    x <- 50
    y <- 47
    cCopula(
      cbind(pnorm(y, 46, 1), pSHASH(x, 46, 5, 2, 1)), 
      indices = 2, 
      copula = copula
    )
    

    Some info.