
Percentage labels per x-axis group in stacked bar plots


I would like to add percentage labels per x-axis group to a stacked bar plot using ggplot2.

Example with overall percentage labels (computed over all x-axis groups):

# packages (the tips dataset comes from reshape2)
library(reshape2)
library(tidyverse)

# make plot
p <- ggplot(tips, aes(x= day,  group=sex)) + 
  geom_bar(aes(y = ..prop.., fill = factor(..group..)), stat="count") +
  geom_text(aes( label = scales::percent(..prop..),
                 y= ..prop.. ), stat= "count", position=position_stack(vjust = 0.5)) +
  labs(y = "Percent", fill="sex") +
  scale_y_continuous(labels = scales::percent)

p

Now I would like to add labels per day (i.e., per x-axis group), e.g., 47.4% (Female) and 52.6% (Male) for Fri. I know how to calculate these values, but I do not know how to do this within the ggplot call.

# make percentage labels per day
tips %>% 
  group_by(day,sex) %>% 
  summarise(total=n()) %>%
  left_join(tips %>% group_by(day) %>% summarise(freq=n()) %>% ungroup()) %>% 
  mutate(percentage=(total/freq))

#> Joining, by = "day"
#> # A tibble: 8 x 5
#> # Groups:   day [4]
#>   day   sex    total  freq percentage
#>   <fct> <fct>  <int> <int>      <dbl>
#> 1 Fri   Female     9    19      0.474
#> 2 Fri   Male      10    19      0.526
#> 3 Sat   Female    28    87      0.322
#> 4 Sat   Male      59    87      0.678
#> 5 Sun   Female    18    76      0.237
#> 6 Sun   Male      58    76      0.763
#> 7 Thur  Female    32    62      0.516
#> 8 Thur  Male      30    62      0.484

Created on 2022-04-02 by the reprex package (v2.0.1)


Solution

  • Update

    Replacing all uses of the .. notation (deprecated in ggplot2 3.4.0) with after_stat() and using ave() instead of tapply(), this can be achieved like so:

    library(ggplot2)
    # For the "tips" dataset
    library(reshape2)
    
    ggplot(tips, aes(x = day, group = sex)) +
      geom_bar(aes(
        y = after_stat(prop),
        fill = factor(after_stat(group))
      ), stat = "count") +
      geom_text(aes(
        label = scales::percent(
          after_stat(ave(count, x, FUN = function(x) x / sum(x)))
        ),
        y = after_stat(prop)
      ), stat = "count", position = position_stack(vjust = 0.5)) +
      labs(y = "Percent", fill = "sex") +
      scale_y_continuous(labels = scales::percent)
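
    To see what the ave() call does on its own, here is a minimal base R sketch outside of ggplot2. The vectors counts and x are hypothetical stand-ins for the count and x columns that stat = "count" computes; the values are copied from the table in the question.

    ```r
    # Hypothetical stand-ins for the computed columns of stat = "count":
    # one count per (day, sex) combination, and the day each count belongs to.
    counts <- c(9, 10, 28, 59, 18, 58, 32, 30)
    x <- rep(c("Fri", "Sat", "Sun", "Thur"), each = 2)

    # ave() applies FUN within each level of x and returns a vector with the
    # same length and order as counts, so every count is divided by its day total.
    share <- ave(counts, x, FUN = function(v) v / sum(v))

    round(share, 3)
    ```

    Because ave() keeps the input order, the result lines up row by row with the counts, which is what lets it be used directly inside after_stat().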
    


    Original Answer

    Using after_stat (instead of ..) for the labels and a custom helper function, you can compute the percentages per day and add them as labels like so. Basically, this involves computing the proportions per x-axis group or category:

    library(ggplot2)
    library(reshape2)
    
    comp_pct <- function(count, group) {
      count / tapply(count, group, sum)[group]
    }
    
    ggplot(tips, aes(x = day, group = sex)) +
      geom_bar(aes(y = ..prop.., fill = factor(..group..)), stat = "count") +
      geom_text(aes(
        label = after_stat(scales::percent(comp_pct(count, x))),
        y = ..prop..
      ), stat = "count", position = position_stack(vjust = 0.5)) +
      labs(y = "Percent", fill = "sex") +
      scale_y_continuous(labels = scales::percent)
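
    The helper can also be tried outside of ggplot2. The sketch below assumes that the computed x column holds the integer positions of the days (1 = Fri, ..., 4 = Thur) and that rows come one sex group after the other; both are assumptions about the layout of the stat output, and the counts are copied from the table in the question.

    ```r
    comp_pct <- function(count, group) {
      # tapply() returns one total per group; indexing with [group] repeats
      # each total so that it lines up with the matching count.
      count / tapply(count, group, sum)[group]
    }

    # Assumed layout: Female counts for the four days, then Male counts.
    count <- c(9, 28, 18, 32, 10, 59, 58, 30)
    x <- c(1, 2, 3, 4, 1, 2, 3, 4)

    pct <- comp_pct(count, x)
    round(as.vector(pct), 3)
    ```

    Note that [group] indexes the totals by position here, which works as long as the group values are the consecutive integers 1, 2, ...; indexing by as.character(group) would look the totals up by name instead.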