Tags: r, statistics, correlation, weighted

Calculating a weighted correlation using different vector weights for each variable in R


I would like to calculate a weighted correlation between two variables having different weights.

Some example data:

DF = data.frame(
  x = c(-0.3, 0.3, -0.18, 0.02, 0.07, 0.11, 0.20, 0.8, 0.3, -0.4),
  x_weight = c(50, 40, 70, 5, 15, 30, 32, 13, 9, 19),
  y = c(-0.6, 0.25, 0.1, 0.3, 0.3, -0.05, -0.5, 1, 0.05, -0.6),
  y_weight = c(70, 8, 10, 39, 9, 49, 90, 77, 23, 75)
)
DF

I read about cov.wt in the stats package, but it only accepts a single vector of weights. Essentially, I'm looking for inputs similar to those of wtd.t.test, but to calculate a correlation instead.

Thank you for your help!


Solution

  • You can calculate the weighted correlation between the two variables step by step, working from the definitions of weighted covariance and weighted correlation.

    First, calculate the weighted means for the two variables using the weights:

    mu_x = sum(DF$x * DF$x_weight) / sum(DF$x_weight)
    mu_y = sum(DF$y * DF$y_weight) / sum(DF$y_weight)
    

    Next, calculate the weighted covariance between the two variables:

    cov_xy = sum((DF$x - mu_x) * (DF$y - mu_y) * DF$x_weight * DF$y_weight) / sum(DF$x_weight * DF$y_weight)
    

    Finally, divide the weighted covariance by the product of the two weighted standard deviations to get the weighted correlation:

    sd_x = sqrt(sum((DF$x - mu_x)^2 * DF$x_weight) / sum(DF$x_weight))
    sd_y = sqrt(sum((DF$y - mu_y)^2 * DF$y_weight) / sum(DF$y_weight))
    cor_xy = cov_xy / (sd_x * sd_y)
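
The steps above can be wrapped into a single function, which may be easier to reuse and check. This is only a sketch of the same calculation: the function name weighted_cor_two_weights and its argument names are made up for illustration and do not come from any package. One caveat: the covariance is weighted by the product of the two weight vectors while each variance uses only its own weights, so the result is not guaranteed to stay within [-1, 1].

```r
# Hypothetical helper (name and arguments are illustrative, not from a package).
# Assumes non-negative weights and vectors of equal length.
weighted_cor_two_weights <- function(x, y, wx, wy) {
  stopifnot(
    length(x) == length(y),
    length(x) == length(wx),
    length(x) == length(wy),
    all(wx >= 0), all(wy >= 0)
  )

  # Weighted means, each variable using its own weights
  mu_x <- weighted.mean(x, wx)
  mu_y <- weighted.mean(y, wy)

  # Covariance weighted by the product of the two weight vectors
  w_xy <- wx * wy
  cov_xy <- sum(w_xy * (x - mu_x) * (y - mu_y)) / sum(w_xy)

  # Weighted standard deviations, each variable using its own weights
  sd_x <- sqrt(sum(wx * (x - mu_x)^2) / sum(wx))
  sd_y <- sqrt(sum(wy * (y - mu_y)^2) / sum(wy))

  cov_xy / (sd_x * sd_y)
}

# Example data from the question
DF <- data.frame(
  x = c(-0.3, 0.3, -0.18, 0.02, 0.07, 0.11, 0.20, 0.8, 0.3, -0.4),
  x_weight = c(50, 40, 70, 5, 15, 30, 32, 13, 9, 19),
  y = c(-0.6, 0.25, 0.1, 0.3, 0.3, -0.05, -0.5, 1, 0.05, -0.6),
  y_weight = c(70, 8, 10, 39, 9, 49, 90, 77, 23, 75)
)

r <- weighted_cor_two_weights(DF$x, DF$y, DF$x_weight, DF$y_weight)
r

# Sanity check: with all weights equal, the formula reduces to the ordinary
# Pearson correlation, so this comparison should be TRUE.
n <- nrow(DF)
all.equal(weighted_cor_two_weights(DF$x, DF$y, rep(1, n), rep(1, n)),
          cor(DF$x, DF$y))
```

The equal-weights comparison is a quick way to confirm the function is wired up correctly before applying it with real weights.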