r, scatter-plot, pairwise

Correlation pairs plot: different point colors for groups and density scatterplot


I want to compare different groups (a and b) of data by their correlation, e.g. a1 vs a2, a1 vs b2, etc. I'm generating a correlation scatterplot using the pairs() function.

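library(scales)  # alpha() is not in base R; it comes from the scales package

myData.df <- data.frame(a1 = rnorm(100), a2 = rnorm(100), b1 = rnorm(100), b2 = rnorm(100))

upper.panel <- function(x, y) {
  points(x, y, pch = 20, col = alpha("mediumorchid4", 0.4))
  # Fit a linear model and build the adjusted R^2 label for this panel
  lmod <- lm(y ~ x)
  modsum <- summary(lmod)
  r2 <- modsum$adj.r.squared
  r2label <- bquote(italic(R)^2 == .(format(r2, digits = 2)))

  # Temporarily switch to unit coordinates to place the label, then restore
  usr <- par("usr")
  on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.9, r2label)
}
pairs(myData.df, lower.panel = NULL, upper.panel = upper.panel)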

I wish to color the scatterplot points differently depending on whether they are within-group or across-group comparisons, i.e. when comparing within groups, ai vs aj, I'll color the points red, bi vs bj blue, and ai vs bj purple, etc.

Differently colored backgrounds for the plots would also be OK.

Alternatively, is there any way to draw pairwise density scatterplots, e.g. using smoothScatter() or IDPmisc::iplot()?

Thanks


Solution

  • You can get a high degree of customization with ggpairs from GGally. I'm sure if you look at the docs you will find you can adapt it to your needs. Here I'll use it to color the points as per your request.

    For your specific use case, however, you need to stack several copies of the data frame (one per pair of columns, each with two columns set to NA) so that a coloring variable can be added. This takes only a little extra code.

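    library(GGally)

    myData.df <- data.frame(a1 = rnorm(100), a2 = rnorm(100), b1 = rnorm(100), b2 = rnorm(100))

    # All column pairs, in order: (1,2) (1,3) (1,4) (2,3) (2,4) (3,4)
    col_pairs <- combn(ncol(myData.df), 2)

    # Stack one copy of the data per pair. In copy i the columns of pair i are
    # set to NA, so only the complementary pair is drawn in that copy's color.
    plot_data <- do.call(rbind, lapply(seq(ncol(col_pairs)), function(i) {
      myData.df[col_pairs[, i]] <- NA
      myData.df$col <- letters[i]
      myData.df
    }))

    # Copy "a" keeps b1/b2 (blue), copy "f" keeps a1/a2 (red),
    # and the four copies in between are a-vs-b comparisons (purple)
    ggpairs(plot_data, columns = 1:4, mapping = aes(color = col)) +
      scale_color_manual(values = c("blue", rep("purple", 4), "red"))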
    

    Created on 2020-07-10 by the reprex package (v0.3.0)
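
If you would rather stay with base pairs(), the following is a minimal sketch of the same coloring rule, plus a density variant based on smoothScatter(). The helpers group_of(), pair_color() and make_panel() are hypothetical names introduced here, not part of any package. The sketch assumes that column names are a group letter followed by digits (a1, b2, ...) and that par("mfg") reports the row and column of the panel currently being drawn, which is worth checking on your R version. smoothScatter() needs the KernSmooth package, so the density plot is skipped if it is not installed.

```r
# Data in the same shape as the question: two "a" columns and two "b" columns
set.seed(1)
myData.df <- data.frame(a1 = rnorm(100), a2 = rnorm(100),
                        b1 = rnorm(100), b2 = rnorm(100))

# Hypothetical helper: the group is the column name without its trailing digits
group_of <- function(name) sub("[0-9]+$", "", name)

# Hypothetical helper: red for a-vs-a, blue for b-vs-b, purple for a-vs-b
pair_color <- function(name_x, name_y) {
  gx <- group_of(name_x)
  gy <- group_of(name_y)
  if (gx != gy) "purple" else if (gx == "a") "red" else "blue"
}

# Hypothetical panel factory. Assumption: inside a pairs() panel,
# par("mfg")[1:2] gives the row and column of the panel being drawn,
# and these index the variables in column order.
make_panel <- function(var_names, density = FALSE) {
  function(x, y, ...) {
    mfg <- par("mfg")
    # Column index -> x variable, row index -> y variable
    col <- pair_color(var_names[mfg[2]], var_names[mfg[1]])
    if (density) {
      # add = TRUE draws the density image into the existing panel
      smoothScatter(x, y, nrpoints = 0, add = TRUE,
                    colramp = colorRampPalette(c("white", col)))
    } else {
      points(x, y, pch = 20, col = adjustcolor(col, alpha.f = 0.4))
    }
  }
}

# Colored points, upper triangle only
pairs(myData.df, lower.panel = NULL,
      upper.panel = make_panel(names(myData.df)))

# Density version of the same layout (requires KernSmooth)
if (requireNamespace("KernSmooth", quietly = TRUE)) {
  pairs(myData.df, lower.panel = NULL,
        upper.panel = make_panel(names(myData.df), density = TRUE))
}
```

Because the color is derived from the column names rather than from a stacked copy of the data, this should extend to more columns per group without changes, as long as the naming convention holds. With more than two groups, pair_color() would need its own mapping.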