
How to draw a line around significant values with R's corrplot package


I have been asked to produce a correlation plot for a collaborator. I chose R for the task, specifically the corrplot package. Searching online, I found several ways to produce such graphics, but not the specific one I was asked for: as the picture shows, the significant values are highlighted by drawing a square around each significant tile. I cannot work out how to do this.

[Figure: example of the required correlation plot]

The closest result I have achieved uses the code below, but I cannot find an option to draw a line around the significant tiles (if one exists).

# Insignificant correlations are left blank
corrplot(res3$r, type="upper", order="hclust", 
         p.mat = res3$P, sig.level = 0.01, insig = "blank")

I tried adding the "addrect" parameter but it didn't work.

# Same call with addrect added (insignificant correlations are still left blank)
corrplot(res3$r, type="upper", order="hclust", p.mat = res3$P,
         addrect=2, sig.level = 0.01, insig = "blank")

Any help will be appreciated.


Solution

  • corrplot allows you to add new plots to an already existing one. Therefore, once you've created the plot of the initial correlation matrix, you can simply add those cells that you want to highlight in an iterative manner using corrplot(..., add = TRUE).

    The only thing required to achieve your goal is an index vector (which I called 'ids') that tells R which cells to highlight. For simplicity, I took a random sample of cells from the initial correlation matrix, but something like ids <- which(p.value < 0.01) (assuming that you've stored your p-values in a separate matrix with the same dimensions as the correlation matrix) would work similarly.

    library(corrplot)
    
    ## create and visualize correlation matrix
    data(mtcars)
    M <- cor(mtcars)
    
    corrplot(M, cl.pos = "n", na.label = " ")
    
    ## select cells to highlight (e.g., statistically significant values)
    set.seed(10)
    ids <- sample(1:length(M), 15L)
    
    ## duplicate correlation matrix and reject all irrelevant values
    N <- M
    N[-ids] <- NA
    
    ## add significant cells to the initial corrplot iteratively 
    for (i in ids) {
      O <- N
      O[-i] <- NA
      corrplot(O, cl.pos = "n", na.label = " ", addgrid.col = "black", add = TRUE, 
               bg = "transparent", tl.col = "transparent")
    }
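
The random sample above only stands in for a real selection. As a minimal sketch of how 'ids' could instead be derived from actual significance tests, the following computes pairwise p-values with base R's cor.test(). The matrix name P and the 0.01 threshold are assumptions chosen for illustration (the threshold mirrors the question's sig.level); corrplot's own cor.mtest() helper should give an equivalent p-value matrix.

```r
## Sketch: build a p-value matrix and use it to pick the cells to highlight.
## `P` and the 0.01 cutoff are illustrative assumptions, not part of the
## original answer.
data(mtcars)
M <- cor(mtcars)
n <- ncol(mtcars)

## pairwise p-values from cor.test(); the diagonal is left as NA
P <- matrix(NA_real_, n, n, dimnames = dimnames(M))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    P[i, j] <- P[j, i] <- cor.test(mtcars[[i]], mtcars[[j]])$p.value
  }
}

## linear indices of the significant off-diagonal cells;
## which() silently skips the NA diagonal
ids <- which(P < 0.01)
```

Because 'ids' holds linear indices into a matrix with the same shape as M, it can replace the sampled 'ids' in the loop above without further changes.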
    


    Note that you could also add all values to highlight in one go (i.e., without requiring a for loop) using corrplot(N, ...), but in that case, an undesirable black margin is drawn all around the plotting area.
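
For comparison, the one-go variant mentioned in the note might look like the sketch below. It reuses the arguments from the loop and simply passes N once. The requireNamespace() guard is an addition for illustration so that the snippet also runs where corrplot is not installed.

```r
## Sketch of the single-call variant: highlight all selected cells at once.
data(mtcars)
M <- cor(mtcars)

## same random selection as in the answer
set.seed(10)
ids <- sample(seq_along(M), 15L)

## keep only the selected cells, blank out everything else
N <- M
N[-ids] <- NA

## plot only if corrplot is available (guard added for illustration)
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(M, cl.pos = "n", na.label = " ")
  corrplot::corrplot(N, cl.pos = "n", na.label = " ", addgrid.col = "black",
                     add = TRUE, bg = "transparent", tl.col = "transparent")
}
```

This is shorter than the loop, at the cost of the black margin around the plotting area described above.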