Tags: r, ggplot2, bar-chart, facet-grid, grouped-bar-chart

Order bar plots in a facet grid in R


I have a bar chart and I want to order the bars, calculate the percentage change from one bar to the next, and show it on the plot. I used factors and tried different methods, including everything I could find on Stack Overflow, but I get all sorts of errors. I managed to order the legend colours, but the bar colours do not follow the same order.

Here is my code:

library(ggplot2)

data$Active_period  <- factor(data$Active_period,levels = c("study_year","last_year", "this_year"))
fct_rev(data$Active_period)

ggplot(data, aes(count, Name, fill = Active_period)) +
geom_col(width = 0.8, position = position_dodge2(width = 0.8, preserve = "single"))+
facet_grid(Time ~ site)+
scale_fill_manual(name = " ",values = c("study_year" ="#5d5c5c", "this_year" ="#4475aa", 
"last_year" = "#333239"),limits = c("study_year","last_year", "this_year"))+
theme(legend.position = "bottom", 
          axis.text = element_text(size = 11), 
          axis.title = element_text(size = 11, face = "bold"),
          strip.text.x = element_text(size = 12),
          strip.text.y = element_text(size = 12),
          plot.title = element_text(size = 14, face = "bold"),
          plot.subtitle = element_text(size = 13))

Here is my data:

Name | Active_period | site  | Time     | count
A    | Last_year     | north | mornings | 10
A    | Last_year     | south | mornings | 20
A    | Last_year     | north | evenings | 45
A    | Last_year     | south | evenings | 35
A    | this_year     | north | mornings | 80
A    | this_year     | south | mornings | 60
A    | this_year     | north | evenings | 95
A    | this_year     | south | evenings | 120
A    | study_year    | north | mornings | 100
A    | study_year    | south | mornings | 400
A    | study_year    | north | evenings | 220
A    | study_year    | south | evenings | 32
B    | Last_year     | north | mornings | 10
B    | Last_year     | south | mornings | 45
B    | Last_year     | north | evenings | 25
B    | Last_year     | south | evenings | 20
B    | this_year     | north | mornings | 300
B    | this_year     | south | mornings | 250
B    | this_year     | north | evenings | 140
B    | this_year     | south | evenings | 20
B    | study_year    | north | mornings | 10
B    | study_year    | south | mornings | 20
B    | study_year    | north | evenings | 10
B    | study_year    | south | evenings | 20

Solution

  • Here is a solution.
    First, I create a custom theme so that the plotting code is simpler. This is not really part of the question, and it can be defined once, before the rest of the answer.

    theme_so_q77804382 <- function(){ 
      theme_grey(base_size = 10) %+replace%
        theme(
          legend.position = "bottom", 
          axis.text = element_text(size = 11), 
          axis.title = element_text(size = 11, face = "bold"),
          strip.text.x = element_text(size = 12),
          strip.text.y = element_text(size = 12, angle = -90),
          plot.title = element_text(size = 14, face = "bold"),
          plot.subtitle = element_text(size = 13)
        )
    }
    

    The plot

    The plot is a horizontal bar plot of counts by name. Instead of mapping Name to the y axis, a common approach is to map the independent variable to the x axis as usual and then flip the axes with coord_flip().
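
A minimal sketch of the two ways to get horizontal bars, using a hypothetical two-row data frame `toy` rather than the asker's data. The first plot follows the approach described above; the second maps the category to y directly, which geom_col() has supported since ggplot2 3.3.0 and which is what the question's code does.

```r
library(ggplot2)

# Toy data, made up for illustration only
toy <- data.frame(Name = c("A", "B"), count = c(10, 25))

# Approach used in this answer: category on x, then flip the coordinates
p_flip <- ggplot(toy, aes(Name, count)) +
  geom_col() +
  coord_flip()

# Alternative: map the category to y directly (ggplot2 >= 3.3.0)
p_direct <- ggplot(toy, aes(count, Name)) +
  geom_col()
```

Both objects should draw the same horizontal bars. The rest of the answer keeps coord_flip(), so hjust in geom_text() acts along the flipped count axis.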

    Now the problem.

    • first, coerce Active_period to factor;
    • then, compute the percentage changes grouped by Name, site and Time;
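    • now that the changes have been computed in the order study_year, last_year, this_year (each bar relative to the previous one, starting from the base level study_year), reverse the factor levels to have the plot as asked.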
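
The three steps can be sketched in base R on a single Name/site/Time group. The values are the A / north / mornings rows from the question; the object names (`period`, `lv`, `change`, `period_rev`) are illustrative, and `sprintf()` stands in for `scales::percent()` so that the sketch needs no packages.

```r
# One group from the question's data: A, north, mornings
period <- c("Last_year", "this_year", "study_year")
count  <- c(10, 80, 100)

lv <- c("study_year", "last_year", "this_year")

# Without recoding, the capitalised "Last_year" matches no level and becomes NA
na_before <- is.na(factor(period, levels = lv))

# Step 1: recode the stray capital, then coerce to a factor in the wanted order
period[period == "Last_year"] <- "last_year"
period <- factor(period, levels = lv)

# Step 2: sort by period and compute the change relative to the previous bar
ord    <- order(period)
period <- period[ord]
count  <- count[ord]
prev   <- c(1, head(count, -1))          # base-R stand-in for dplyr::lag(count, default = 1)
change <- c(0, diff(count)) / prev
perc   <- sprintf("%.0f%%", 100 * change)

# Step 3: reverse the levels for plotting, as forcats::fct_rev() does
period_rev <- factor(period, levels = rev(levels(period)))
```

For this group the counts in period order are 100, 10, 80, so `perc` holds "0%", "-90%" and "700%". The first value is always 0% because the first bar has nothing to be compared with.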

    Then just plot it, with the percentages drawn by geom_text. The text must be dodged in the same way as the bars, and since fill is not a geom_text aesthetic, use group = Active_period. Play a little with hjust so that the minus signs stay visible whenever the changes are negative.
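
A minimal sketch of the labelling idea, on hypothetical toy data with a precomputed `label` column. The point it illustrates is that geom_col() and geom_text() receive the same position specification and that the text layer states its grouping explicitly; storing the position in a variable `dodge` is a choice made here, not something the answer's code does.

```r
library(ggplot2)

# Toy data, made up for illustration; `label` plays the role of `perc`
toy <- data.frame(
  Name   = rep(c("A", "B"), each = 2),
  period = rep(c("last_year", "this_year"), times = 2),
  count  = c(10, 80, 25, 140),
  label  = c("", "700%", "", "460%")
)

# One position object shared by both layers, so bars and labels line up
dodge <- position_dodge2(width = 0.8, preserve = "single")

p <- ggplot(toy, aes(Name, count, fill = period)) +
  geom_col(width = 0.8, position = dodge) +
  geom_text(
    aes(label = label, group = period),  # explicit grouping for the text layer
    position = dodge,
    hjust = -0.2                         # nudge labels just past the bar ends
  ) +
  coord_flip()
```

If the labels run off the panel for the longest bars, widening the count axis (for example with scale_y_continuous(expand = ...)) is one option to try.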

    suppressPackageStartupMessages({
      library(dplyr)
      library(ggplot2)
    })
    
    data$Active_period[data$Active_period == "Last_year"] <- "last_year"
    
    data %>%
      mutate(Active_period = factor(Active_period,levels = c("study_year","last_year", "this_year"))) %>%
      group_by(Name, site, Time) %>%
      arrange(Active_period) %>%
      mutate(perc = scales::percent(c(0, diff(count))/dplyr::lag(count, default  = 1))) %>%
      ungroup() %>%
      mutate(Active_period = forcats::fct_rev(Active_period)) %>%
      ggplot(aes(Name, count, fill = Active_period)) +
      geom_col(
        width = 0.8, 
        position = position_dodge2(width = 0.8, preserve = "single")
      ) +
      geom_text(
        position = position_dodge2(width = 0.8, preserve = "single"),
        aes(label = perc, group = Active_period),
        hjust = -0.2
      ) +
      scale_fill_manual(
        name = " ", 
        values = c(study_year ="#5d5c5c", this_year = "#4475aa", last_year = "#333239"),
        limits = c("study_year","last_year", "this_year")
      ) +
      coord_flip() +
      facet_grid(Time ~ site) +
      theme_so_q77804382()
    

    Created on 2024-01-12 with reprex v2.0.2


    Edit

    In order to remove the "0%" from the first bars, change the dplyr pipe to

    data %>%
      mutate(Active_period = factor(Active_period,levels = c("study_year","last_year", "this_year"))) %>%
      group_by(Name, site, Time) %>%
      arrange(Active_period) %>%
      mutate(
        perc = scales::percent(c(0, diff(count))/dplyr::lag(count, default  = 1)),
        perc = c("", perc[-1L])
      ) %>%
      ungroup() %>%
      mutate(Active_period = forcats::fct_rev(Active_period)) %>%
      # ... followed by the same ggplot() call and layers as in the first code block
    

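    The change that removes the number is in the second mutate: its second line, perc = c("", perc[-1L]), replaces the first label of each group (the study_year bar) with an empty string.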
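
A short base-R sketch of what that replacement does to the labels of one group. The vector `perc` is filled in by hand here and `perc_blank` is an illustrative name.

```r
# Labels for one group, in the order study_year, last_year, this_year
perc <- c("0%", "-90%", "700%")

# Drop the first label and put an empty string in its place
perc_blank <- c("", perc[-1L])
```

`perc_blank` keeps the same length as `perc`, which matters because mutate() needs one label per row in each group.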