r, ggplot2, tidyverse, factoring

Reordering multiple columns by fill with count and legend reorder


I'm having issues reordering columns across different groups in a dataset by the relative counts in each group. The dataset, in tibble form, is below. It has 3 groups, each with several device types and a frequency count for each:

library(tidyverse) 
library(ggplot2)

dy2 <- tibble(generation = c("All Devices","All Devices","All Devices","All Devices","All Devices","All Devices", 
                      "First Gen", "First Gen","First Gen","First Gen","First Gen","First Gen",
                      "Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen"),
       device_type = as.factor(c("Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever")),
       N = c(6,36,26,4,18,39,3,20,17,4,8,14,3,16,9,10,25))

When I plot the dataset in ggplot, I am trying to get the device types within each group arranged by increasing N, with a geom_text label showing N above each column. I have only been able to get this to work for the first group ("All Devices"). The code for the plot is below:

dy2 %>% 
  ggplot(aes(x= generation, y= N, fill= reorder(device_type,N, function(x){sum(x)}))) +
  geom_bar(position= position_dodge(), alpha= 0.85, stat = "identity")+
  geom_text(data= ~ subset(.x, generation %in% c("All Devices")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  geom_text(data= ~ subset(.x, generation %in% c("First Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+ 
  geom_text(data= ~ subset(.x, generation %in% c("Subsequent Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  scale_fill_manual(name= NULL,
                    values = c("blue", "black", "red", "green3", "cyan4", "purple"),
                    breaks = c("Accessories", "Aspiration_catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever"),
                    labels = c("Accessories", "Aspiration catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever")) +
  #scale_x_discrete(breaks= c("All Devices", "First Gen", "Subsequent Gen"),
                 #  labels= c("All<br>Devices", "First<br>Gen", "Sub<br>Gen"))+
  theme_classic()

which gives the plot: [image: plot1]

As you can see, the order of the device type columns for "First Gen" and "Subsequent Gen" is not correct, while the geom_text labels showing N are in the correct positions but don't match the columns beneath them.

I have been playing with factoring the dataset as well as different reorder commands, all to no avail.

I also have not been able to rearrange the fill legend to follow the order of the "All Devices" group, no matter how I arrange the breaks in scale_fill_manual.

I am sure there's some factoring issue that I'm missing, but any help would be much appreciated.


Solution

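  • One option would be to make use of a helper column. A factor has only one level order, so reordering the fill variable cannot give a different column order within each generation; a per-generation helper mapped to the group aesthetic can.

    1. Arrange your data by generation and N.
    2. Create a helper column. I simply paste generation and device_type together.
    3. Set the levels of the helper column to the row order of the dataset, e.g. using forcats::fct_inorder.
    4. Map the helper column to the group aesthetic.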
    library(dplyr)
    library(forcats)
    library(ggplot2)
    
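    # Sort rows so that, within each generation, devices run from smallest to largest N
    dy2 <- dy2 %>%
      arrange(generation, N) %>%
      mutate(
        # One helper value per generation/device pair, with levels in the sorted row order
        device_type2 = paste(generation, device_type, sep = "_"),
        device_type2 = fct_inorder(device_type2)
      )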
    
    # fill still picks the colour per device; group decides the dodge order within each generation
    ggplot(dy2, aes(x = generation, y = N, fill = device_type, group = device_type2)) +
      geom_bar(position = position_dodge(), alpha = 0.85, stat = "identity") +
      geom_text(position = position_dodge(0.9), aes(y = N + 0.8, label = N), size = 3, show.legend = FALSE) +
      scale_fill_manual(
        name = NULL,
        values = c("blue", "black", "red", "green3", "cyan4", "purple"),
        breaks = c(
          "Accessories", "Aspiration_catheter", "Guidewire",
          "Microcatheter", "Sheath", "Stentretriever"
        )
      ) +
      theme_classic()
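
The question also asks for the legend to follow the order of the "All Devices" group, which the code above leaves alphabetical. A minimal sketch of one way to do that, assuming the legend should run from smallest to largest N within "All Devices": derive the breaks from that group instead of typing them by hand. The names dy2_ordered, legend_order and device_colours are introduced here for illustration only, the data is rebuilt as a plain data.frame so the block runs on its own, and the colour-to-device pairing is assumed from the order of the values and breaks in the code above.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Same values as the dataset in the question, rebuilt so this block is self-contained.
dy2 <- data.frame(
  generation = rep(c("All Devices", "First Gen", "Subsequent Gen"), times = c(6, 6, 5)),
  device_type = c(
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever"
  ),
  N = c(6, 36, 26, 4, 18, 39, 3, 20, 17, 4, 8, 14, 3, 16, 9, 10, 25),
  stringsAsFactors = FALSE
)

# Helper column as in the steps above: levels follow the rows sorted by generation, then N.
dy2_ordered <- dy2 %>%
  arrange(generation, N) %>%
  mutate(device_type2 = fct_inorder(paste(generation, device_type, sep = "_")))

# Legend order (assumption): device types sorted by increasing N within "All Devices".
legend_order <- dy2 %>%
  filter(generation == "All Devices") %>%
  arrange(N) %>%
  pull(device_type)

# Named colours, so each device keeps its colour whatever order the breaks are in.
device_colours <- c(
  Accessories = "blue", Aspiration_catheter = "black", Guidewire = "red",
  Microcatheter = "green3", Sheath = "cyan4", Stentretriever = "purple"
)

p <- ggplot(dy2_ordered, aes(x = generation, y = N, fill = device_type, group = device_type2)) +
  geom_col(position = position_dodge(0.9), alpha = 0.85) +
  geom_text(aes(y = N + 0.8, label = N), position = position_dodge(0.9),
            size = 3, show.legend = FALSE) +
  scale_fill_manual(
    name = NULL,
    values = device_colours,
    breaks = legend_order,                      # legend entries in "All Devices" order
    labels = function(x) gsub("_", " ", x)      # "Aspiration_catheter" -> "Aspiration catheter"
  ) +
  theme_classic()

p
```

Because the colours are a named vector, changing legend_order (for example to rev(legend_order) for a descending legend) only moves the legend entries and leaves the bars and their colours untouched.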