
ggplot: Fill Boxplots Using the Package RColorBrewer + Plot Boxplots Using plot_grid() in the Cowplot Package in R


Overview

I have two data frames called 'ANOVA.Dataframe.1' and 'ANOVA.Dataframe.2' (see below).

For this project, I have two aims:

(1) Fill the boxplots using the package RColorBrewer;

(2) Plot the boxplots using the package cowplot.

Issues

  • First, I created two plot objects, Boxplot.obs1.Canopy.Urban and Boxplot.obs2.Canopy.Urban (function 1 and function 2 in the R-code below). I then added scale_fill_brewer(palette="Dark2"), the ggplot2 function that applies RColorBrewer palettes, to each of them by following this example, which gave New.filled.Boxplot.obs1.Canopy.Urban and New.filled.Boxplot.obs2.Canopy.Urban. However, the boxplots were not filled (see image below).

  • When I combined the boxplots using plot_grid() in the cowplot package, the label headings (i.e. A: Observation Period 1 and B: Observation Period 2) overlapped the boxplots (see image below). Is there a way to adjust the plotting space so that the boxplots are very slightly smaller and each label heading sits above its boxplot instead?

If anyone can be of assistance, I would be deeply appreciative.

R-Code

library(tidyverse)
library(wrapr)
library(RColorBrewer)
library(dplyr)

# Open Colour Brewer Palette Options
display.brewer.all()


## Function 1 to produce the boxplots for Dataframe 1

Boxplot.obs1.Canopy.Urban<-ANOVA.Dataframe.1 %.>%
                                   ggplot(data = ., aes(
                                   x = Urbanisation_index,
                                   y = Canopy_Index,
                                   group = Urbanisation_index,
                                   )) +
                                   stat_boxplot(
                                   geom = 'errorbar',
                                   width = .25
                                   ) +
                                   geom_boxplot(notch=T) +
                                   geom_line(
                                   data = group_by(., Urbanisation_index) %>%
                                   summarise(
                                   bot = min(Canopy_Index),
                                   top = max(Canopy_Index)
                                    ) %>%
                                   gather(pos, val, bot:top) %>% 
                                   select(
                                   x = Urbanisation_index,
                                   y = val
                                   ) %>%
                                   mutate(gr = row_number()) %>%
                                   bind_rows(
                                   tibble(
                                   x = 0,
                                   y = max(.$y) * 1.15,
                                   gr = 1:8
                                   )
                                   ),
                                  aes(
                                  x = x,
                                  y = y,
                                  group = gr
                                  )) +
                                  theme_light() +
                                  theme(panel.grid = element_blank()) +
                                  coord_cartesian(
                                  xlim = c(min(.$Urbanisation_index) - .5, max(.$Urbanisation_index) + .5),
                                  ylim = c(min(.$Canopy_Index) * .95, max(.$Canopy_Index) * 1.05)
                                   ) +
                                 ylab('Canopy Index (%)') +
                                 xlab('Urbanisation Index')

 ## Change the colours of the boxplot
New.filled.Boxplot.obs1.Canopy.Urban <- Boxplot.obs1.Canopy.Urban + scale_fill_brewer(palette="Dark2")

 

## Function 2 to produce the boxplots for Dataframe 2
Boxplot.obs2.Canopy.Urban<-ANOVA.Dataframe.2 %.>%
                                   ggplot(data = ., aes(
                                   x = Urbanisation_index,
                                   y = Canopy_Index,
                                   group = Urbanisation_index,
                                   )) +
                                   stat_boxplot(
                                   geom = 'errorbar',
                                   width = .25
                                   ) +
                                   geom_boxplot(notch=T) +
                                   geom_line(
                                   data = group_by(., Urbanisation_index) %>%
                                   summarise(
                                   bot = min(Canopy_Index),
                                   top = max(Canopy_Index)
                                    ) %>%
                                   gather(pos, val, bot:top) %>% 
                                   select(
                                   x = Urbanisation_index,
                                   y = val
                                   ) %>%
                                   mutate(gr = row_number()) %>%
                                   bind_rows(
                                   tibble(
                                   x = 0,
                                   y = max(.$y) * 1.15,
                                   gr = 1:8
                                   )
                                   ),
                                  aes(
                                  x = x,
                                  y = y,
                                  group = gr
                                  )) +
                                  theme_light() +
                                  theme(panel.grid = element_blank()) +
                                  coord_cartesian(
                                  xlim = c(min(.$Urbanisation_index) - .5, max(.$Urbanisation_index) + .5),
                                  ylim = c(min(.$Canopy_Index) * .95, max(.$Canopy_Index) * 1.05)
                                   ) +
                                 ylab('Canopy Index (%)') +
                                 xlab('Urbanisation Index')


## Change the colours of the boxplot

 New.filled.Boxplot.obs2.Canopy.Urban<- Boxplot.obs2.Canopy.Urban + scale_fill_brewer(palette="Dark2")

 

library(cowplot)

## Open New plot window
dev.new()

Combined_boxplot_Obs<-plot_grid(New.filled.Boxplot.obs1.Canopy.Urban, 
                                New.filled.Boxplot.obs2.Canopy.Urban, 
                                labels=c("A: Observation Period 1",
                                         "B: Observation Period 2"),
                                label_fontface="bold",
                                label_fontfamily="Times New Roman",
                                label_size=12,
                                align="v",
                                ncol=2, nrow=1)

Combined_boxplot_Obs

This R-code produces these plots:

[Image: the two boxplots side by side; the boxes are unfilled and the label headings overlap the plots]

Data frame 1

ANOVA.Dataframe.1 <- structure(list(Urbanisation_index = c(2, 2, 4, 4, 3, 3, 4, 4, 
4, 2, 4, 3, 4, 4, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 
2, 2, 2, 4, 4, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 4, 4, 4, 
4, 4, 4, 4), Canopy_Index = c(65, 75, 55, 85, 85, 85, 95, 85, 
85, 45, 65, 75, 75, 65, 35, 75, 65, 85, 65, 95, 75, 75, 75, 65, 
75, 65, 75, 95, 95, 85, 85, 85, 75, 75, 65, 85, 75, 65, 55, 95, 
95, 95, 95, 45, 55, 35, 55, 65, 95, 95, 45, 65, 45, 55)), row.names = c(NA, 
-54L), class = "data.frame")

Data frame 2

ANOVA.Dataframe.2 <- structure(list(Urbanisation_index = c(2, 2, 4, 4, 3, 3, 4, 4, 
4, 3, 4, 4, 4, 4, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 
2, 2, 2, 4, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 4, 4, 4, 4, 4, 4, 4
), Canopy_Index = c(5, 45, 5, 5, 5, 5, 45, 45, 55, 15, 35, 45, 
5, 5, 5, 5, 5, 5, 35, 15, 15, 25, 25, 5, 5, 5, 5, 5, 5, 15, 25, 
15, 35, 25, 45, 5, 25, 5, 5, 5, 5, 55, 55, 15, 5, 25, 15, 15, 
15, 15)), row.names = c(NA, -50L), class = "data.frame")

Solution

    1. scale_fill_brewer(palette = "Dark2") has no effect in your example because you do not provide a fill aesthetic. You need to add one to your boxplot.
    2. The labels in plot_grid() are meant to be single letters (or at least short) for reference in a caption. For your purpose I'd recommend using titles in the original plots.
    3. Your code is quite hard to read, and you can reduce the number of packages used. I also shortened the names, as they are not important here and the long ones make everything more verbose.
    4. I would calculate special statistics not inside the ggplot() call, but beforehand in a separate data.frame.
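
To illustrate point 1, here is a minimal sketch on a made-up data frame (toy, grp and val are hypothetical names, not the data from the question). It builds the same boxplot with and without a fill aesthetic and inspects the fill colours of the built layer with layer_data():

```r
library(ggplot2)

# Hypothetical toy data, only for illustration
toy <- data.frame(
  grp = rep(c(1, 2, 3), each = 5),
  val = c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 3, 4, 5, 6, 7)
)

# No fill aesthetic: scale_fill_brewer() has nothing to act on
p_plain <- ggplot(toy, aes(grp, val, group = grp)) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Dark2")

# Fill mapped to a discrete variable: the palette is applied
p_filled <- ggplot(toy, aes(grp, val, group = grp)) +
  geom_boxplot(aes(fill = factor(grp))) +
  scale_fill_brewer(palette = "Dark2")

# Fill colours actually used by the boxplot layer of each plot
fills_plain  <- unique(layer_data(p_plain)$fill)
fills_filled <- unique(layer_data(p_filled)$fill)

length(fills_plain)   # one default fill shared by all boxes
length(fills_filled)  # one palette colour per group
```

Wrapping the grouping variable in factor() matters here, because scale_fill_brewer() is a discrete scale and would not accept a numeric fill variable.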

    Packages

    library(tidyverse)
    library(cowplot)
    

    1st Boxplots

    # Calculate special positions for lines first
    mydf.1.lines <- mydf.1 %>% 
      group_by(Urbanisation) %>%
      summarise(bot = min(Canopy), top = max(Canopy)) %>%
      gather(pos, val, bot:top) %>% 
      select(x = Urbanisation, y = val) %>%
      mutate(gr = row_number()) %>%
      bind_rows(tibble(x = 0, y = max(.$y) * 1.15, gr = 1:8))
    
    # Calculate plot limits 
    xlimits.1 <- with(mydf.1, c(min(Urbanisation) - .5, max(Urbanisation) + .5))
    ylimits.1 <- with(mydf.1, c(min(Canopy) * .95, max(Canopy) * 1.05))
    
    Boxplot.1 <- 
      ggplot(mydf.1, aes(Urbanisation, Canopy, group = Urbanisation)) +
      stat_boxplot(geom = 'errorbar', width = .25) +
      # Add a fill aesthetic in the geom_boxplot() call:
      geom_boxplot(aes(fill = factor(Urbanisation)), notch = TRUE) +
      geom_line(data = mydf.1.lines, 
                aes(x, y, group = gr)) +
      theme_light() +
      theme(panel.grid = element_blank()) +
      coord_cartesian(xlim = xlimits.1, ylim = ylimits.1) +
      ylab('Canopy Index (%)') +
      xlab('Urbanisation Index')
    
    New.filled.Boxplot.1 <- Boxplot.1 + scale_fill_brewer(palette = "Dark2")
    

    2nd Boxplots
    Analogous to the 1st:

    mydf.2.lines <- mydf.2 %>% 
      group_by(Urbanisation) %>%
      summarise(bot = min(Canopy), top = max(Canopy)) %>%
      gather(pos, val, bot:top) %>% 
      select(x = Urbanisation, y = val) %>%
      mutate(gr = row_number()) %>%
      bind_rows(tibble(x = 0, y = max(.$y) * 1.15, gr = 1:8))
    
    xlimits.2 <- with(mydf.2, c(min(Urbanisation) - .5, max(Urbanisation) + .5))
    ylimits.2 <- with(mydf.2, c(min(Canopy) * .95, max(Canopy) * 1.05))
    
    Boxplot.2 <- 
      ggplot(mydf.2, aes(Urbanisation, Canopy, group = Urbanisation)) +
      stat_boxplot(geom = 'errorbar', width = .25) +
      geom_boxplot(aes(fill = factor(Urbanisation)), notch = TRUE) +
      geom_line(data = mydf.2.lines, 
                aes(x, y, group = gr)) +
      theme_light() +
      theme(panel.grid = element_blank()) +
      coord_cartesian(xlim = xlimits.2, ylim = ylimits.2) +
      ylab('Canopy Index (%)') +
      xlab('Urbanisation Index')
    
    New.filled.Boxplot.2 <- Boxplot.2 + scale_fill_brewer(palette = "Dark2")
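
Since the two blocks differ only in the data frame they use, the shared part could be wrapped in a small function. The following is only a sketch: make_canopy_boxplot() is a hypothetical helper name, the geom_line() layer is left out for brevity, and the toy data frame is made up and merely reuses the column names Urbanisation and Canopy.

```r
library(ggplot2)

# Hypothetical helper: builds the filled, notched boxplot for any data frame
# that has the columns Urbanisation and Canopy (geom_line layer omitted)
make_canopy_boxplot <- function(df) {
  xlimits <- c(min(df$Urbanisation) - .5, max(df$Urbanisation) + .5)
  ylimits <- c(min(df$Canopy) * .95, max(df$Canopy) * 1.05)

  ggplot(df, aes(Urbanisation, Canopy, group = Urbanisation)) +
    stat_boxplot(geom = "errorbar", width = .25) +
    geom_boxplot(aes(fill = factor(Urbanisation)), notch = TRUE) +
    theme_light() +
    theme(panel.grid = element_blank()) +
    coord_cartesian(xlim = xlimits, ylim = ylimits) +
    scale_fill_brewer(palette = "Dark2") +
    labs(x = "Urbanisation Index", y = "Canopy Index (%)")
}

# Made-up data with the same column names, only to show the call
toy <- data.frame(
  Urbanisation = rep(c(1, 2), each = 6),
  Canopy = c(35, 55, 65, 75, 85, 95, 45, 55, 65, 75, 85, 95)
)
toy_plot <- make_canopy_boxplot(toy)
```

With the data from this answer, the calls would be make_canopy_boxplot(mydf.1) and make_canopy_boxplot(mydf.2).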
    

    Combine Plots

    plot_grid(New.filled.Boxplot.1 + ggtitle("A: Observation Period 1"),
              New.filled.Boxplot.2 + ggtitle("B: Observation Period 2"), 
              align = "v",
              ncol = 2,
              nrow = 1)
    

    Or keep the plot_grid() labels and position them with hjust and label_x, using an empty title to reserve space above each plot (thanks to Claus Wilke):

    plot_grid(New.filled.Boxplot.1 + ggtitle(""),
              New.filled.Boxplot.2 + ggtitle(""), 
              align = "v",
              labels = c("A: Observation Period 1", "B: Observation Period 2"),
              hjust = 0, 
              label_x = 0.01,
              ncol = 2,
              nrow = 1)
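
The question also asked whether the boxplots can be made slightly smaller. One option that might help is the scale argument of plot_grid(), which shrinks each plot within its panel. The sketch below uses a made-up plot (toy and p are hypothetical), and the value 0.9 is an arbitrary choice that would need tuning by eye:

```r
library(ggplot2)
library(cowplot)

# Hypothetical stand-in plot, only for illustration
toy <- data.frame(
  grp = rep(c(1, 2), each = 5),
  val = c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6)
)
p <- ggplot(toy, aes(grp, val, group = grp)) + geom_boxplot()

combined <- plot_grid(
  p, p,
  labels = c("A: Observation Period 1", "B: Observation Period 2"),
  label_size = 12,
  hjust = 0,       # left-align the labels ...
  label_x = 0.01,  # ... close to the left edge of each panel
  scale = 0.9,     # draw each plot at 90% of its panel, leaving a margin
  ncol = 2
)
```

Because scale shrinks the plot about the centre of its panel, it frees space on all four sides, not only at the top, so an empty ggtitle("") may still give the tidier result.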
    


    Boxplot outside of plot
    The problem here is that the notches extend beyond the hinges, and therefore beyond the plot limits. If you set notch = FALSE for the second plot (or both), the problem disappears. Alternatively, you could adjust the ylimits, as you already suggested. The function with() simply specifies the data.frame (mydf.2) in which the following columns can be found. Thus the call

    ylimits.2 <- with(mydf.2, c(min(Canopy) * .95, max(Canopy) * 1.05))
    

    is equivalent to

    ylimits.2 <- c(min(mydf.2$Canopy) * .95, max(mydf.2$Canopy) * 1.05)
    

    and you could for example specify

    ylimits.2 <- c(-20, max(mydf.2$Canopy) * 1.05)
    

    This would set the lower limit to -20 and the upper limit to 1.05 times the maximum of the Canopy index in the second dataframe.
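
To see why a lower limit around -20 is needed, the notch can be recomputed by hand. The sketch below assumes the rule given in the geom_boxplot() documentation, where notches extend 1.58 * IQR / sqrt(n) on either side of the median, and uses the six Canopy values that belong to Urbanisation == 1 in mydf.2:

```r
# Canopy values for Urbanisation == 1 in the second data frame (mydf.2)
canopy_g1 <- c(5, 5, 5, 5, 55, 55)

q <- quantile(canopy_g1, c(0.25, 0.5, 0.75), names = FALSE)
hinge_lower <- q[1]       # lower hinge (first quartile): 5
med         <- q[2]       # median: 5
iqr         <- q[3] - q[1]

# Notch half-width as documented for geom_boxplot(): 1.58 * IQR / sqrt(n)
half_width  <- 1.58 * iqr / sqrt(length(canopy_g1))
notch_lower <- med - half_width   # about -19.2
notch_upper <- med + half_width

# Lower plot limit from the answer, min(Canopy) * .95
# (the minimum of mydf.2$Canopy is also 5)
ylim_lower <- min(canopy_g1) * 0.95

notch_lower < hinge_lower  # TRUE: the notch goes outside the hinge
notch_lower < ylim_lower   # TRUE: and below the default lower limit
```

The lower end of the notch for this group lies at roughly -19.2, which is why -20 is just enough to show it in full.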

    Data

    mydf.1 <- 
      structure(list(Urbanisation = c(2, 2, 4, 4, 3, 3, 4, 4, 4, 2, 4, 3, 4, 4, 1, 
                                      1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 
                                      2, 2, 4, 4, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 
                                      2, 1, 4, 4, 4, 4, 4, 4, 4), 
                     Canopy = c(65, 75, 55, 85, 85, 85, 95, 85, 85, 45, 65, 75, 75, 
                                65, 35, 75, 65, 85, 65, 95, 75, 75, 75, 65, 75, 65, 
                                75, 95, 95, 85, 85, 85, 75, 75, 65, 85, 75, 65, 55, 
                                95, 95, 95, 95, 45, 55, 35, 55, 65, 95, 95, 45, 65, 
                                45, 55)), 
                row.names = c(NA, -54L), class = "data.frame")
    
    mydf.2 <- 
      structure(list(Urbanisation = c(2, 2, 4, 4, 3, 3, 4, 4, 4, 3, 4, 4, 4, 4, 1, 
                                      1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 
                                      2, 2, 4, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 4, 4, 
                                      4, 4, 4, 4, 4), 
                     Canopy = c(5, 45, 5, 5, 5, 5, 45, 45, 55, 15, 35, 45, 5, 5, 5, 
                                5, 5, 5, 35, 15, 15, 25, 25, 5, 5, 5, 5, 5, 5, 15, 
                                25, 15, 35, 25, 45, 5, 25, 5, 5, 5, 5, 55, 55, 15, 
                                5, 25, 15, 15, 15, 15)), 
                row.names = c(NA, -50L), class = "data.frame")