
R: Calculate and display percentages using dplyr and geom_text




library(dplyr)
library(ggplot2)

df <- data.frame(
  Language = factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), levels = 1:2, labels = c("GER", "ENG")),
  Agegrp   = factor(c(1, 2, 3, 1, 2, 4, 1, 2, 3, 2, 3, 3, 3, 3, 1, 1, 2, 1, 1, 4), levels = 1:4, labels = c("10-19", "20-29", "30-39", "40+"))
)
  

df %>% ggplot(aes(x = Agegrp, fill = Language)) + 
  geom_bar(position = 'dodge') +
  labs(title = "Age-structure between German and English",
       y = "Number of persons")
              

Using the sample data above, I can create a dodged bar chart of the number of persons per age group. But

  • how can I calculate the percentage of each age group within each language (using dplyr), and
  • how can I draw the same plot with percentages on the y-axis?

In this example the percentages are easy to read off because both languages have the same number of cases (10), but that will not necessarily hold for real data. Thank you for your help!


Solution

  • To calculate the percentage of each Agegrp within a Language, you can try:

    library(dplyr)
    library(ggplot2)
    
    df %>%
      count(Agegrp, Language) %>%        # rows per Agegrp/Language combination
      group_by(Language) %>%
      mutate(n = prop.table(n)) %>%      # turn counts into within-language proportions
      ungroup() %>%
      ggplot(aes(x = Agegrp, y = n, fill = Language)) + 
      geom_col(position = 'dodge') +
      scale_y_continuous(labels = scales::percent) + 
      labs(title = "Age-structure between German and English",
           y = "Percentage of persons")
    

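Since the title also asks about geom_text, here is a minimal sketch of one way to keep the percentage table as its own object and print the values above the bars. The names `pct` and `p`, the dodge width of 0.9, and the rounding to whole percent are my own choices for illustration, not something the question or the answer above prescribes.

```r
library(dplyr)
library(ggplot2)

# Same sample data as in the question (10 cases per language)
df <- data.frame(
  Language = factor(rep(1:2, each = 10), levels = 1:2, labels = c("GER", "ENG")),
  Agegrp   = factor(c(1, 2, 3, 1, 2, 4, 1, 2, 3, 2, 3, 3, 3, 3, 1, 1, 2, 1, 1, 4),
                    levels = 1:4, labels = c("10-19", "20-29", "30-39", "40+"))
)

# Share of each age group within each language;
# the shares add up to 1 within every language
pct <- df %>%
  count(Language, Agegrp) %>%
  group_by(Language) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

# Dodged bars plus a text label per bar; geom_text needs an explicit
# dodge width so the labels line up with the bars
p <- ggplot(pct, aes(x = Agegrp, y = pct, fill = Language)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(aes(label = scales::percent(pct, accuracy = 1)),
            position = position_dodge(width = 0.9), vjust = -0.5) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Age-structure between German and English",
       y = "Percentage of persons")

p
```

Printing `pct` gives the table asked for in the first part of the question; for the sample data, for instance, the GER / 20-29 row has n = 4 and a share of 0.4. Because the shares are computed per language with `n / sum(n)`, this should also hold up when the languages have different numbers of cases.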