Tags: r, ggplot2, density-plot, iris-dataset, geom-vline

Adjust color of geom_vline in ggplot based on group


I am creating a density plot of Sepal.Length for two species of the iris dataset. The graph looks like this:

[Image: density plot of Sepal.Length for setosa and versicolor]

As you can see, I added two measures of central tendency to the graph (Median and Mean). However, right now they are colored depending on the measure (i.e., Median = black, Mean = grey).

I want to change the colors so that the mean and median of each species share one color (one color for setosa, another for versicolor). Furthermore, I want to add a legend indicating the color and linetype for the mean/median of setosa/versicolor. In the end I would have a legend with four entries: Median Setosa, Mean Setosa, Median Versicolor, Mean Versicolor.

Does anyone know how to do this? Please see code for reproduction below:

library(ggplot2)
library(tidyverse)

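# Keep only the two species; %in% tests every row against both names
# (== would recycle the vector and silently drop half of the rows)
iris <- iris %>%
  filter(Species %in% c("setosa", "versicolor"))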

temp <- iris %>% 
  group_by(Species) %>%
  dplyr::summarize(Mean = mean(Sepal.Length, na.rm=TRUE))

temp_2 <- iris %>% 
  group_by(Species) %>%
  dplyr::summarize(Median = median(Sepal.Length, na.rm=TRUE))

ggplot(iris, aes(x = Sepal.Length, fill = factor(Species))) +
  geom_density(alpha = 0.5) +
  theme_minimal() +
  geom_vline(data=temp, aes(xintercept=Mean, color="Mean"),
             linetype="dotted", 
             lwd=1) +
  geom_vline(data=temp_2, aes(xintercept=Median, color="Median"),
             linetype="dashed", 
             lwd=1) +
  scale_color_manual(name = "Statistics", values = c(Median = "black", Mean = "grey50")) + 
  labs(title = "Distribution of Sepal.Length",
       x = "Sepal.Length",
       y = "Density",
       fill = "Species") 
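A side note on the filter step: when a column is compared against more than one value, `==` and `%in%` behave differently. The sketch below uses only base R and the built-in `datasets::iris` (50 rows per species); the variable names `wanted`, `n_equal` and `n_in` are illustrative and not part of the original code.

```r
# Base R only; datasets::iris has 50 rows per species
wanted <- c("setosa", "versicolor")

# `==` recycles `wanted` along the rows: odd rows are compared with
# "setosa", even rows with "versicolor", so half of the matches are lost
n_equal <- sum(datasets::iris$Species == wanted)

# `%in%` tests every row against the whole set of names
n_in <- sum(datasets::iris$Species %in% wanted)

print(c(equal = n_equal, in_set = n_in))
# equal = 50, in_set = 100
```

With `==`, each species would contribute only 25 observations to the density plot, so `%in%` is usually the safer choice for this kind of filter.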

Solution

  • This is relatively complex. You need to map both the linetype and the color of the lines to the interaction of Species and a string denoting median or mean, then specify the manual scales accordingly:

    ggplot(iris, aes(x = Sepal.Length, fill = factor(Species))) +
      geom_density(alpha = 0.5) +
      theme_minimal() +
      geom_vline(data=temp, aes(xintercept=Mean, 
                                color = interaction(Species, 'Mean'),
                                linetype = interaction(Species, 'Mean')),
                 lwd = 1, key_glyph = draw_key_path) +
      geom_vline(data=temp_2, aes(xintercept=Median, 
                                  color = interaction(Species, 'Median'),
                                  linetype = interaction(Species, 'Median')),
                 lwd=1,  key_glyph = draw_key_path) +
      scale_linetype_manual(values = c(setosa.Mean = 'dotted', 
                                       setosa.Median = 'dashed',
                                       versicolor.Mean = 'dotted',
                                       versicolor.Median = 'dashed'),
                            name = 'Averages') +
      scale_color_manual(values = c(setosa.Mean = "#F8766D", 
                                    setosa.Median = "#F8766D",
                                    versicolor.Mean = "#00BFC4",
                                    versicolor.Median = "#00BFC4"),
                         name = 'Averages') +
      labs(title = "Distribution of Sepal.Length",
           x = "Sepal.Length",
           y = "Density",
           fill = "Species") +
      theme(legend.key.width = unit(15, 'mm'))
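
The names in the two manual scales have to match the levels that `interaction()` produces, which by default joins its inputs with a dot. Here is a minimal base-R sketch of that naming. It uses a stand-in factor rather than the summarised data frames and reuses the hex codes from the answer. The helper names `species`, `mean_keys`, `median_keys`, `species_colors` and `line_colors` are only illustrative.

```r
# Stand-in for the Species column after filtering (assumption: two levels only)
species <- factor(c("setosa", "versicolor"))

# interaction() joins the levels with "." by default
mean_keys   <- levels(interaction(species, "Mean"))
median_keys <- levels(interaction(species, "Median"))
print(c(mean_keys, median_keys))

# Named vector in the shape that scale_color_manual(values = ...) expects:
# both statistics of a species get the same color
species_colors <- c("#F8766D", "#00BFC4")
line_colors <- c(setNames(species_colors, mean_keys),
                 setNames(species_colors, median_keys))
print(line_colors)
```

If the printed keys differ from the names passed to `values`, the affected lines will not pick up the intended color or linetype, so checking the levels first can save some debugging.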
    
