Tags: ggplot2, legend, boxplot, geom-hline

Order of items in an R graph legend differs between two datasets containing the same data


I'm newish to R and Stack Overflow. I wrote some R code to make grouped boxplots of chlorophyll data by season and site, with a water quality criteria reference line. It may not be the most elegant or correct script, but it seems to get the job done. However, the items in the legend are ordered differently depending on which dataset I use (see the example graphs below): in one, the WQ criteria entry is listed below the site list, and in the other it is placed above the site list. I would like to be able to control the order and location of the different items in the legend.

# Load libraries (only the ones this example actually uses)
library(ggplot2)  # plotting
library(dplyr)    # %>% and arrange()
library(ggtext)   # element_markdown() for the axis titles

Site <- c("Site 1","Site 8","Site 10","Site 3")
Season <- c("Win", "Spr", "Sum", "Fall")
Chla <- c(5, 10, 15, 20)

Sample_data <- data.frame(Site, Season, Chla)
newsam <- Sample_data
newsam$Site <- factor(newsam$Site, levels = c("Site 1", "Site 3", "Site 8", "Site 10"))
# Note: the sorted result is only printed, not assigned, so newsam itself is unchanged
newsam %>% arrange(desc(Site))

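# Sample_data (Site is a plain character column): WQ line listed under the site list
ggplot(Sample_data, aes(x=factor(Season), y=Chla, fill=factor(Site))) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_hline(aes(yintercept = 10, linetype = "WQ Criteria"), color = "gray",
             linewidth = 1) +
  geom_hline(aes(yintercept = 0.5, linetype = "Reporting Limit"), color = "red", linewidth = 1) +
  geom_boxplot() +
  # fun replaces the deprecated fun.y (ggplot2 >= 3.3.0); marks each group's mean with an x
  stat_summary(fun = mean, geom="point", shape=4, size=3, color="black",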
               position = position_dodge2(width = 0.75, preserve = "single")) +
  labs(x = "Season", y = "Chlorophyll α (mg/m<sup>3</sup>)", linetype="", fill="") +
  theme_classic() +
  theme(
    axis.title.x = element_markdown(),
    axis.title.y = element_markdown() 
  ) +
  scale_x_discrete(limits = c("Win", "Spr", "Sum", "Fall")) +
  scale_fill_brewer(palette = "Paired") +
  ggtitle("Keystone Lake Chlorophyll, 2013-2022") +
  theme(plot.title = element_text(hjust = 0.5))

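# newsam (Site is a factor with explicit levels): WQ line listed over the site list
ggplot(newsam, aes(x=factor(Season), y=Chla, fill=factor(Site))) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_hline(aes(yintercept = 10, linetype = "WQ Criteria"), color = "gray",
             linewidth = 1) +
  geom_hline(aes(yintercept = 0.5, linetype = "Reporting Limit"), color = "red", linewidth = 1) +
  geom_boxplot() +
  # fun replaces the deprecated fun.y (ggplot2 >= 3.3.0); marks each group's mean with an x
  stat_summary(fun = mean, geom="point", shape=4, size=3, color="black",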
               position = position_dodge2(width = 0.75, preserve = "single")) +
  labs(x = "Season", y = "Chlorophyll α (mg/m<sup>3</sup>)", linetype="", fill="") +
  theme_classic() +
  theme(
    axis.title.x = element_markdown(),
    axis.title.y = element_markdown() 
  ) +
  scale_x_discrete(limits = c("Win", "Spr", "Sum", "Fall")) +
  scale_fill_brewer(palette = "Paired") +
  ggtitle("Keystone Lake Chlorophyll, 2013-2022") +
  theme(plot.title = element_text(hjust = 0.5))

[Graph with "Sample_data"](https://i.sstatic.net/GPqIoRmQ.png)

[Graph with "newsam"](https://i.sstatic.net/gwsf6hWI.png)

I tried using guide_legend() and override.aes, but could not get the order I wanted.


Solution

  • You've already done it for the graph with newsam by making Site a factor and setting the levels:

    newsam$Site <- factor(newsam$Site, levels = c("Site 1", "Site 3", "Site 8", "Site 10"))

    Do the same for the plot with Sample_data, either before calling ggplot() or by adding the levels argument to fill=factor(Site) inside aes().
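
A minimal sketch of that suggestion, using the toy data from the question. The names site_levels and p are introduced here for illustration only. The guides() call is an addition that the answer does not mention: it assumes you also want to pin where the linetype legend sits relative to the site legend, which is what the order argument of guide_legend() is documented to control.

```r
library(ggplot2)

# Same toy data as in the question
Sample_data <- data.frame(
  Site   = c("Site 1", "Site 8", "Site 10", "Site 3"),
  Season = c("Win", "Spr", "Sum", "Fall"),
  Chla   = c(5, 10, 15, 20)
)

# Fix the level order once, before plotting (site_levels is a hypothetical helper name)
site_levels <- c("Site 1", "Site 3", "Site 8", "Site 10")
Sample_data$Site <- factor(Sample_data$Site, levels = site_levels)

p <- ggplot(Sample_data, aes(x = Season, y = Chla, fill = Site)) +
  geom_hline(aes(yintercept = 10, linetype = "WQ Criteria"), color = "gray") +
  geom_boxplot() +
  scale_x_discrete(limits = c("Win", "Spr", "Sum", "Fall")) +
  labs(linetype = "", fill = "") +
  # Assumption, not part of the answer: request sites first, reference line second
  guides(
    fill     = guide_legend(order = 1),
    linetype = guide_legend(order = 2)
  )

p
```

If you would rather not modify the data frame, the inline variant would be aes(fill = factor(Site, levels = site_levels)) in the ggplot() call. Swapping the two order values should place the reference-line entry above the site list instead.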