r, ggplot2, cumsum

Ordering geom_col NOT by fill value


Please resist the instinct to jump straight to defining factor levels. I am trying to make a bar plot with text annotations. I'm using geom_col with a y value aesthetic, and I'm using geom_text with a separate data frame in which the value has been converted into a cumulative sum. The order matters here: I want the bars stacked in the same order in which the cumulative sum is calculated.

Example

library(ggplot2)
library(data.table)

example_df <- data.frame(gender = c('M', 'F', 'F', 'M'), month = c('1', '1', '2', '2'), 
           value = c(10, 20, 30, 40), name = c('Jack', 'Kate', 'Nassrin', 'Malik'))
setDT(example_df)
text_df <- example_df[, .(value=cumsum(value), name=name), by='month']

ggplot(example_df) + geom_col(aes(x=month, y=value, fill=gender)) +
  geom_text(data=text_df, aes(x=month, y=value, label=name), vjust=1)

[Image: example plot]

As you can see here, the left bar is exactly what I want. Jack is labeled at 10 over the M color, and Kate is labeled 20 above that (at 30) over the F color. The right bar, though, is wrong: Nassrin is labeled at 30, but that label sits over the M color, whose segment has a height of 40. This is because geom_col by default stacks by fill, which is converted to a factor with levels in alphabetical order. What I want is for the left bar to be stacked M, F but the right one F, M. Is this possible? Or is my best option to reorder my cumulative sum (which would lead to a different plot than I intend)?


Solution

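  • Set group and fill separately. The stacking order (i.e. the position) is controlled by group, not by fill. When you don't define group, ggplot2 sets it automatically from the discrete variables in the plot, which in this case means the stacking follows fill. Grouping by name in order of appearance (fct_inorder and fct_rev come from the forcats package) makes the stacking follow the row order used for the cumulative sum. So: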

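    library(forcats)  # provides fct_inorder() and fct_rev()

    # group sets the stacking order; fill only sets the colour
    ggplot(example_df) + 
      geom_col(aes(x=month, y=value, group = fct_rev(fct_inorder(name)), fill = gender)) +
      geom_text(data=text_df, aes(x=month, y=value, label=name), vjust=1)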
    

    Note that we can also let ggplot compute the cumulative sums for us by giving geom_text the same stacking position as the bars. Then we can use just the original data.frame, simplifying your plot to:

    # fct_rev() and fct_inorder() are from the forcats package
    ggplot(example_df, aes(month, value, group = fct_rev(fct_inorder(name)))) + 
      geom_col(aes(fill = gender)) +
      geom_text(aes(label = name), position = 'stack', vjust = 1)
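
To check that the stacking really follows the cumulative-sum order, the bar positions ggplot2 computes can be compared with the running totals. The following is only a sketch under a few assumptions: it rebuilds the example data as a plain data.frame, uses base R's factor() with reversed order-of-appearance levels as a stand-in for fct_rev(fct_inorder(name)), and stores that factor in a hypothetical column called stack_order. It relies on layer_data() to pull out the computed bar layer.

```r
library(ggplot2)

example_df <- data.frame(
  gender = c("M", "F", "F", "M"),
  month  = c("1", "1", "2", "2"),
  value  = c(10, 20, 30, 40),
  name   = c("Jack", "Kate", "Nassrin", "Malik")
)

# Base-R stand-in for fct_rev(fct_inorder(name)): levels in reverse order of
# first appearance. `stack_order` is a hypothetical column name.
example_df$stack_order <- factor(example_df$name,
                                 levels = rev(unique(example_df$name)))

p <- ggplot(example_df, aes(month, value, group = stack_order)) +
  geom_col(aes(fill = gender)) +
  geom_text(aes(label = name), position = "stack", vjust = 1)

# Extract the computed bar layer (layer 1) and map each group index
# back to the name it was built from.
bars <- layer_data(p, 1)
bars$name <- levels(example_df$stack_order)[bars$group]

# Per-month running totals in row order, i.e. the label heights
# that the question computes with cumsum().
expected <- ave(example_df$value, example_df$month, FUN = cumsum)

# Put the top of each bar segment next to the running total for the same name.
check <- data.frame(
  name     = example_df$name,
  bar_top  = bars$ymax[match(example_df$name, bars$name)],
  expected = expected
)
print(check)
```

If the two numeric columns agree for every name, each label sits at the top of its own segment. Dropping the group mapping from the plot should make the comparison fail for month 2, which is the mismatch described in the question.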