
Extract certain values out of a correlation matrix


Is there a way to extract the correlation coefficients from a correlation matrix?

Let's say I have a dataset with 3 variables (a, b, c) and I want to calculate the correlations among them.

with


df <- data.frame(a = c(2, 3, 3, 5, 6, 9, 14, 15, 19, 21, 22, 23),
                 b = c(23, 24, 24, 23, 17, 28, 38, 34, 35, 39, 41, 43),
                 c = c(13, 14, 14, 14, 15, 17, 18, 19, 22, 20, 24, 26),
                 d = c(6, 6, 7, 8, 8, 8, 7, 6, 5, 3, 3, 2))

and

cor(df[, c('a', 'b', 'c')])

I'll get a correlation matrix:

          a         b         c
 a 1.0000000 0.9279869 0.9604329
 b 0.9279869 1.0000000 0.8942139
 c 0.9604329 0.8942139 1.0000000

Is there a way to show the results in a manner like this:

  1. Correlation between a and b is: 0.9279869.
  2. Correlation between a and c is: 0.9604329.
  3. Correlation between b and c is: 0.8942139.

?

My real correlation matrix is obviously bigger (~300 entries), and I need a way to extract only the values that are important to me.

Thanks.


Solution

  • Using reshape2 and melt

    df <- data.frame("a" = c(2, 3, 3, 5, 6, 9, 14, 15, 19, 21, 22, 23),
                     "b" = c(23, 24, 24, 23, 17, 28, 38, 34, 35, 39, 41, 43),
                     "c" = c(13, 14, 14, 14, 15, 17, 18, 19, 22, 20, 24, 26),
                     "d" = c(6, 6, 7, 8, 8, 8, 7, 6, 5, 3, 3, 2))
    
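    tmp <- cor(df[, c('a', 'b', 'c')])
    # Blank out the lower triangle and the diagonal so that
    # each pair of variables is kept only once
    tmp[lower.tri(tmp)] <- NA
    diag(tmp) <- NA
    
    library(reshape2)
    # melt() turns the matrix into long format (Var1, Var2, value);
    # na.omit() drops the blanked-out cells
    na.omit(melt(tmp))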
    

    resulting in

      Var1 Var2     value
    4    a    b 0.9279869
    7    a    c 0.9604329
    8    b    c 0.8942139
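
If you would rather not load a package, the same idea can be sketched in base R. This is only an illustration, not part of the original answer: the names `cm`, `idx`, `pairs` and `msgs` are made up for this example. It picks the upper-triangle cells with `upper.tri()` and `which(..., arr.ind = TRUE)`, then builds the sentences asked for in the question with `sprintf()`.

```r
df <- data.frame(a = c(2, 3, 3, 5, 6, 9, 14, 15, 19, 21, 22, 23),
                 b = c(23, 24, 24, 23, 17, 28, 38, 34, 35, 39, 41, 43),
                 c = c(13, 14, 14, 14, 15, 17, 18, 19, 22, 20, 24, 26))

cm <- cor(df[, c("a", "b", "c")])

# Row/column positions of the cells above the diagonal (each pair once)
idx <- which(upper.tri(cm), arr.ind = TRUE)

# Long-format table: one row per pair of variables
pairs <- data.frame(var1 = rownames(cm)[idx[, "row"]],
                    var2 = colnames(cm)[idx[, "col"]],
                    r    = cm[idx],
                    stringsAsFactors = FALSE)

# One sentence per pair, in the format requested in the question
msgs <- sprintf("Correlation between %s and %s is: %.7f.",
                pairs$var1, pairs$var2, pairs$r)
cat(msgs, sep = "\n")
```

To keep only the pairs that matter in a larger matrix, you can subset `pairs` before printing, for example `pairs[abs(pairs$r) > 0.9, ]` (the 0.9 cutoff is an arbitrary value chosen for illustration).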