Tags: r, ggplot2, statistics, data-analysis

R - correlation matrix without duplicated values in edges


My data looks like this:

## data =
##  A B C a b c
##  0 1 0 1 1 0
##  0 0 1 1 0 0
##  1 1 0 0 1 0
##  0 0 1 0 0 1
##  0 1 0 1 1 0
##  1 0 0 0 1 0

How can I correlate the data to get a result like this:

##      A    B    C
## a   0.7 -0.2 -0.2 
## b   0.3 -0.5  1.0
## c  -0.7  0.4 -1.0

I was inspired by this article and want to create a similar heatmap, but more in this fashion:

[image]

Is running cor(data) and then cropping the matrix to the desired submatrix the proper approach? Or should I use some other function instead of cor(data)?


Solution

  • Since the desired submatrix is not a block on the diagonal of the full correlation matrix, I don't think there is a better shortcut. With your data stored in M, you should use

    cor(M)[c("a", "b", "c"), c("A", "B", "C")]
    #            A          B          C
    # a -0.7071068  0.3333333  0.0000000
    # b  0.5000000  0.7071068 -1.0000000
    # c -0.3162278 -0.4472136  0.6324555
    

    or just cor(M)[4:6, 1:3].
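
    For anyone who wants to reproduce this, here is a minimal sketch. It assumes the data from the question is stored as a numeric matrix named M (the name used above); the helper names rows, cols, sub_full and sub_xy are only illustrative. It also shows one possible alternative: cor() accepts a second argument y, and when both x and y are matrices it returns only the correlations between the columns of x and the columns of y, so the full 6 x 6 matrix never has to be built.

    ```r
    # Rebuild the example data from the question as a numeric matrix.
    M <- matrix(
      c(0, 1, 0, 1, 1, 0,
        0, 0, 1, 1, 0, 0,
        1, 1, 0, 0, 1, 0,
        0, 0, 1, 0, 0, 1,
        0, 1, 0, 1, 1, 0,
        1, 0, 0, 0, 1, 0),
      ncol = 6, byrow = TRUE,
      dimnames = list(NULL, c("A", "B", "C", "a", "b", "c"))
    )

    rows <- c("a", "b", "c")  # variables that become the rows of the result
    cols <- c("A", "B", "C")  # variables that become the columns of the result

    # Approach from the answer: full correlation matrix, then subset by name.
    sub_full <- cor(M)[rows, cols]

    # Alternative: pass the two column groups as x and y,
    # so only the cross-correlations are computed.
    sub_xy <- cor(M[, rows], M[, cols])

    # Both approaches should agree up to floating-point tolerance.
    all.equal(sub_full, sub_xy)
    ```

    For a small matrix like this the two approaches are interchangeable; the x/y form mostly matters when the full matrix would be large.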