Copula result in R


I have a table with two columns, each holding an already computed index for one of two variables. A sample is shown below:

 V1, V2
 0.46,1.08
 0.84,1.05
-0.68,0.93
-0.99,0.68
-0.87,0.30
-1.08,-0.09
-1.16,-0.34
-0.61,-0.43
-0.65,-0.48
 0.73,-0.48

To measure the correlation between these two variables, I am using the copula package in R.

I used the following VineCopula code to determine which copula family to use:

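library(VineCopula)
# Assumption: u and v are the pseudo-observations (scaled ranks in (0, 1)) of V1 and V2
u <- rank(mydata$V1) / (nrow(mydata) + 1)
v <- rank(mydata$V2) / (nrow(mydata) + 1)
selectedCopula <- BiCopSelect(u, v, familyset = NA)
selectedCopula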

It suggested the survival Gumbel copula, which is the rotated version of the Gumbel copula according to the copula R manual (Link).

However, I chose the Frank copula because it offers a symmetric dependence structure and can model both positive and negative dependence in the data. How plausible is that choice?

One more thing. I ran the following self-explanatory copula code:
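The claim that the Frank copula covers both signs of dependence can be checked through the standard link between its parameter theta and Kendall's tau, tau = 1 - (4 / theta) * (1 - D1(theta)), where D1 is the first-order Debye function. The following is a minimal base-R sketch of that relation; frank_tau is a hypothetical helper name introduced here for illustration, not a function of the copula package, whose tau(frankCopula(theta)) should give the same value.

```r
# Kendall's tau implied by a Frank copula with parameter theta.
# frank_tau is a hypothetical helper, written in base R only.
frank_tau <- function(theta) {
  if (theta == 0) return(0)  # theta = 0 is the independence case
  # Debye function D1(theta), rewritten as an integral over [0, 1]
  # by substituting t = theta * s
  integrand <- function(s) {
    x <- theta * s
    ifelse(x == 0, 1, x / expm1(x))
  }
  d1 <- integrate(integrand, lower = 0, upper = 1)$value
  1 - 4 / theta * (1 - d1)
}

tau_pos <- frank_tau(3.23)   # positive dependence
tau_neg <- frank_tau(-3.23)  # negative dependence of the same strength
print(c(tau_pos, tau_neg))
```

For theta = 3.23 this gives a tau of roughly 0.33, and flipping the sign of theta flips the sign of tau, which is the symmetry mentioned above.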


library(copula)

# Estimate V1 distribution parameters and visually compare simulated vs observed data
x_mean <- mean(mydata$V1)
# Normal distribution
hist(mydata$V1, breaks = 20, col = "green", density = 30)
hist(rnorm(nrow(mydata), mean = x_mean, sd = sd(mydata$V1)),
     breaks = 20, col = "blue", add = TRUE, density = 30, angle = -45)

# Same for V2
y_mean <- mean(mydata$V2)
# Normal distribution
hist(mydata$V2, breaks = 20, col = "green", density = 30)
hist(rnorm(nrow(mydata), mean = y_mean, sd = sd(mydata$V2)),
     breaks = 20, col = "blue", add = TRUE, density = 30, angle = -45)


# Measure association using Kendall's Tau
cor(mydata, method = "kendall")


# Fit the chosen copula: estimate the Frank copula parameter
cop_model <- frankCopula(dim = 2)
m <- pobs(as.matrix(mydata))
fit <- fitCopula(cop_model, m, method = 'ml')
coef(fit)

# Check Kendall's tau value for the Frank copula with theta = 3.236104
tau(frankCopula(param = 3.23))

# Build the bivariate distribution using the Frank copula
sdx <- sd(mydata$V1)
sdy <- sd(mydata$V2)
my_dist <- mvdc(frankCopula(param = 3.23, dim = 2), margins = c("norm", "norm"),
                paramMargins = list(list(mean = x_mean, sd = sdx),
                                    list(mean = y_mean, sd = sdy)))

# Sample 439 observations from the bivariate distribution
sim <- rMvdc(439, my_dist)
# Compute the density at the sampled points
pdf_mvd <- dMvdc(sim, my_dist)
# Compute the CDF at the sampled points
cdf_mvd <- pMvdc(sim, my_dist)

# Plot the data for a visual comparison
plot(mydata$V1, mydata$V2, main = 'Test dataset x and y', col = "blue")
points(sim[,1], sim[,2], col = 'red')
legend('bottomright', c('Observed', 'Simulated'), col = c('blue', 'red'), pch=21)

The plot shows a good fit, even for extreme values.

Now I want to show the correlated values obtained from the Frank copula together with my original data in the same line graph, but I could not figure out how to extract the Frank copula results. (Ideally as a single column, so I can plot it with the original data for a visual comparison.)


Solution

  • I am not sure I understand your question correctly. However, if you want the data generated from the Frank copula, they are stored in sim. If you are asking about Kendall's tau, it should be available from the fitCopula result. The Frank copula data cannot be a single column; it must be a matrix with one column per variable. Also, pobs already returns its result as a matrix, so you do not need as.matrix. If you need more help, I am very happy to help.
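As a rough sketch of what pulling the simulated values out of sim could look like: each variable is one column of the matrix, so a column can be extracted and drawn next to the observed values. To stay runnable without the copula package, the block below uses only base R and makes several assumptions. mydata is rebuilt from the ten sample rows quoted in the question, r_frank_normal is a hypothetical stand-in for rMvdc() (Frank copula by conditional inversion, then normal margins), and theta = 3.23 is the value used in the question. Simulated draws are random and not paired row by row with the observations, so both series are sorted before plotting.

```r
# Observed data: the ten sample rows quoted in the question
mydata <- data.frame(
  V1 = c(0.46, 0.84, -0.68, -0.99, -0.87, -1.08, -1.16, -0.61, -0.65, 0.73),
  V2 = c(1.08, 1.05, 0.93, 0.68, 0.30, -0.09, -0.34, -0.43, -0.48, -0.48)
)

# Hypothetical base-R stand-in for rMvdc(): sample a Frank copula by
# conditional inversion, then map to normal margins with qnorm()
r_frank_normal <- function(n, theta, means, sds) {
  u <- runif(n)
  w <- runif(n)
  v <- -log1p(w * expm1(-theta) / (w + (1 - w) * exp(-theta * u))) / theta
  cbind(V1 = qnorm(u, means[1], sds[1]),
        V2 = qnorm(v, means[2], sds[2]))
}

set.seed(1)
n <- nrow(mydata)
sim <- r_frank_normal(n, theta = 3.23,
                      means = colMeans(mydata),
                      sds = sapply(mydata, sd))

# Each variable is one column of the simulated matrix
sim_V1 <- sim[, 1]
sim_V2 <- sim[, 2]

# Sort both series so the line graph compares their distributions
comparison <- cbind(observed = sort(mydata$V1), simulated = sort(sim_V1))

if (interactive()) {
  matplot(comparison, type = "l", lty = 1, col = c("blue", "red"),
          xlab = "Rank", ylab = "V1")
  legend("bottomright", colnames(comparison), col = c("blue", "red"), lty = 1)
}
```

With the real objects from the question, the same extraction would apply directly to the matrix returned by rMvdc(), provided the number of simulated rows matches nrow(mydata) so that both lines share one x-axis.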