R: Using ggplot, how to make scatterplot with different shapes in 2 separate variables?


I would like each tissue to have its own color and shape: heart a purple square, liver a green triangle, and lung an orange diamond. I would also like treatment to be shown by whether the shape is open or closed, so that, for example, a control heart is an open purple square. I'm having difficulty because I want shape to encode two different variables at once. Below is reproducible code that produces the scatterplot shown below. I would appreciate any guidance!

library(tidyverse)
set.seed(1)  # arbitrary seed so the random values are repeatable
ex <- data.frame(tissue=rep(c('lung','heart', 'liver'),each=10),
                 treatment=rep(rep(c('smoking','control'),5),each=3),
                 value1=rnorm(30) + rep(c(3,1,4,2,5),each=6),
                 value2=rnorm(30) + rep(c(30,11,43,21,15), each = 6))

ggplot(ex, aes(value1, value2)) +
  geom_point(size = 2, aes(col = tissue, shape = treatment))+
  scale_color_manual(values = c("#7030a0", "#548235", "#c55a11")) +
  scale_shape_manual(values = c(21, 19))

Example PCA plot


Solution

  • Since shape has to encode two variables, I suggest using interaction() to combine them into a single factor, then setting the scale values and legend entries for each combination.

    library(ggplot2)
    
    ggplot(ex, aes(value1, value2)) +
      geom_point(size = 2, aes(col = interaction(tissue, treatment), shape = interaction(tissue, treatment))) +
      scale_color_manual(values = c("heart.control" = "#7030a0", 
                                    "heart.smoking" = "#7030a0",
                                    "liver.control" = "#548235", 
                                    "liver.smoking" = "#548235", 
                                    "lung.control" = "#c55a11",
                                    "lung.smoking" = "#c55a11"),
                         labels = c("heart", "liver", "lung"),
                         breaks = c("heart.control", "liver.control", "lung.control")) +
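      # Shapes 22/24/23 are drawn open because no fill is mapped; 15/17/18 are solid.
      # Control is open and smoking is closed, as the question asks.
      scale_shape_manual(values = c("heart.control" = 22, 
                                    "heart.smoking" = 15,
                                    "liver.control" = 24, 
                                    "liver.smoking" = 17, 
                                    "lung.control" = 23,
                                    "lung.smoking" = 18),
                         labels = c("control", "smoking"),
                         breaks = c("heart.control", "heart.smoking")) +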
      labs(color = "Tissue", shape = "Treatment") +
      # Show each tissue in the colour legend as a filled version of its own shape
      guides(colour = guide_legend(override.aes = list(fill = c("#7030a0", "#548235", "#c55a11"), shape = c(22, 24, 23))))
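
The six-entry named vectors above can also be generated rather than typed out. The following is a minimal sketch in base R, not part of the original answer. The helper names (tissue_cols, open_shapes, closed_shapes, combos, colour_values, shape_values) are hypothetical, and it assumes the question's convention that control points are open and smoking points are closed. It relies on interaction() naming its levels as the two values joined by a dot.

```r
# Per-tissue colours and shapes (assumed convention: open = fillable shapes
# 22/24/23 drawn without a fill; closed = solid shapes 15/17/18).
tissue_cols   <- c(heart = "#7030a0", liver = "#548235", lung = "#c55a11")
open_shapes   <- c(heart = 22, liver = 24, lung = 23)
closed_shapes <- c(heart = 15, liver = 17, lung = 18)

# Every tissue/treatment pair, keyed the way interaction() names its levels.
combos <- expand.grid(tissue = names(tissue_cols),
                      treatment = c("control", "smoking"),
                      stringsAsFactors = FALSE)
combos$key <- paste(combos$tissue, combos$treatment, sep = ".")

# Same colour for both treatments of a tissue.
colour_values <- setNames(unname(tissue_cols[combos$tissue]), combos$key)

# Open shape for control, closed shape for smoking.
shape_values <- setNames(
  ifelse(combos$treatment == "control",
         open_shapes[combos$tissue],
         closed_shapes[combos$tissue]),
  combos$key
)

# Check that the keys match the level names interaction() produces.
demo_levels <- levels(interaction(combos$tissue, combos$treatment))
stopifnot(setequal(demo_levels, names(shape_values)))
```

colour_values and shape_values can then be passed as the values argument of scale_color_manual() and scale_shape_manual() in place of the hand-written vectors, with the labels and breaks arguments unchanged.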