
R: using corrplot to visualize two variables (e.g., correlation and p-value) using the size and colour of the circles


I am trying to recreate someone's image using corrplot. This is the original image I am trying to re-create:

[Image: the original figure]

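I use the following R code:

library(corrplot)
library(RColorBrewer)  # provides brewer.pal()

# rgs holds the correlations, pvalues the matching p-values
corrplot(as.matrix(rgs),
         method = "circle",
         type = "upper",
         col = brewer.pal(n = 8, name = "PuOr"),
         tl.col = "black",
         tl.srt = 45,
         p.mat = as.matrix(pvalues),  # p-values used to hide circles
         sig.level = 0.05,
         insig = "blank")             # leave non-significant cells empty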

Which gives me this:

[Image: my corrplot output]

The problem is that in my plot both the colour and the size of the circles are based on the correlations. In the original image, the colour of the circles is based on the correlation, while their size is based on the p-values. I have the p-values in a separate data frame called pvalues (I already use it in the last three arguments of the call above to decide which circles are shown). My question is: how can I make the colour and the size depend on two different variables, as in the original image? Is that even possible using corrplot?


Solution

  • What you want does not seem to be possible with corrplot unless you hack it a bit. I simply added a new parameter, size_vector, which is used when drawing the circles. See https://github.com/johannes-titz/corrplot/commit/9362f6a7c2fda794b5ef8895b77f0b2ff979092a for the changed lines.

    # install the hacked version (requires the devtools package)
    devtools::install_github("johannes-titz/corrplot@size_parameter")
    library(corrplot)
    data(mtcars)
    M <- cor(mtcars)
    # get p values
    p_vals_mat <- cor.mtest(mtcars)$p
    # size_vector exists only in the hacked version; 1 - p maps small p to large circles
    corrplot(M, size_vector = 1 - as.numeric(p_vals_mat))
    

    [Image: resulting plot]

    Note that I used 1-p for the size (small p-values produce large circles). You can use any value between 0 and 1 for the size.

    Further note that in the original figure, the relationship between the p-value and the circle size is non-linear. So you might want to use some transformation that comes close to this relationship.

    In any case, I would advise against using such figures. p-values are problematic on their own, and plotting them with some kind of transformation does not make much sense to me. The size of the correlation is likely the most important information, and the plot does not reflect it. This can easily confuse readers.

    PS: I did not bother to add a legend, but this should not be too difficult to do with legend().
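
    As a rough sketch of what such a legend could look like, the snippet below draws circles for a few p-values with base R's legend(). The p-values shown, the fill colour, and the max_cex scale factor are assumptions for illustration; pt.cex is not in the same units corrplot uses for its circles, so the scale would need tuning by eye. plot.new() is only a stand-in so the snippet runs on its own.

    ```r
    # Stand-in for the real figure so the snippet runs on its own;
    # in practice, call corrplot(...) here instead.
    plot.new()

    p_legend <- c(0.001, 0.01, 0.05, 0.5)   # illustrative p-values
    size_legend <- 1 - p_legend             # same 1 - p mapping as in the call above
    max_cex <- 3                            # hypothetical scale factor, tune by eye

    legend("bottomleft",
           legend = paste("p =", p_legend),
           pch = 21, pt.bg = "grey70",
           pt.cex = max_cex * size_legend,
           y.intersp = 1.5, bty = "n", xpd = TRUE)
    ```

    With the plain 1 - p mapping the legend circles for small p-values are almost the same size, which is one more reason a non-linear transformation may be needed.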

    A small update: The transformation of the p-values might be something like this:

    transform_p <- function(x) {
      y <- 0.91 - (0.82) * (1 - exp(-3.82 * x))
      y
    }
    

    This slightly changes the size of the circles:

    corrplot(M, size_vector = as.numeric(transform_p(p_vals_mat)))
    

    [Image: resulting plot]

    Again, I do not recommend it, but it should be a bit closer to the original figure.
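
    To get a feel for the transformation, it can be evaluated at a few p-values. The function is repeated here so the snippet is self-contained; the p-values are arbitrary choices for illustration.

    ```r
    # Same transformation as above: maps a p-value to a circle size
    transform_p <- function(x) {
      0.91 - 0.82 * (1 - exp(-3.82 * x))
    }

    p <- c(0, 0.001, 0.01, 0.05, 0.5, 1)   # illustrative p-values
    sizes <- transform_p(p)

    # Show p-values next to the resulting sizes
    print(data.frame(p = p, size = round(sizes, 3)))
    ```

    The size is 0.91 at p = 0 and falls to roughly 0.11 at p = 1, with most of the drop happening at small p-values, so it always stays inside the 0 to 1 range mentioned above.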

    If you just want the upper triangle, pass only the p-values of the upper triangle:

    upper_tri <- p_vals_mat[upper.tri(p_vals_mat, diag = TRUE)]
    corrplot(M, size_vector = transform_p(upper_tri), type = "upper")
    

    [Image: resulting plot]
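
    For reference, here is what indexing with upper.tri() and diag = TRUE returns on a small toy matrix (the matrix m is made up for illustration). The values come out column by column, which is presumably the order the hacked corrplot expects, given the call above.

    ```r
    # Toy 3 x 3 matrix, filled column by column with 1..9
    m <- matrix(1:9, nrow = 3)

    # Logical mask of the upper triangle, including the diagonal
    mask <- upper.tri(m, diag = TRUE)

    # Indexing with the mask returns a plain vector in column-major order
    upper_vals <- m[mask]
    print(upper_vals)  # 1 4 5 7 8 9
    ```

    An n x n matrix yields n * (n + 1) / 2 values this way, so the 11 variables in mtcars give a vector of length 66.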