How to label the count of each bin with the ggridges package?


I have a data frame that simulates the NFL season with 2 columns: team and rank. I am trying to use ggridges to make a distribution plot of the frequency of each team at each rank from 1-10. I can get the plot working, but I'd like to display the count of each team/rank in each bin. I have been unsuccessful so far.

   library(ggplot2)
   library(ggridges)

   ggplot(results, 
       aes(x=rank, y=team, group = team)) +
   geom_density_ridges2(aes(fill=team), stat='binline', binwidth=1, scale = 0.9, draw_baseline=TRUE) +
   scale_x_continuous(limits = c(0,11), breaks = seq(1,10,1)) +
   theme_ridges() +
   theme(legend.position = "none") +
   scale_fill_manual(values = c("#4F2E84", "#FB4F14",  "#7C1415", "#A71930", "#00143F", "#0C264C", "#192E6C", "#136677", "#203731"), name = NULL)

Which creates this plot:

[Image: binned ridgeline plot of rank for each team]

I tried adding in this line to get the count added to each bin, but it did not work.

   geom_text(stat='bin', aes(y = team + 0.95*stat(count/max(count)),
                         label = ifelse(stat(count) > 0, stat(count), ""))) +

Not the exact dataset but this should be enough to at least run the original plot:

   results = data.frame(team = rep(c('Jets', 'Giants', 'Washington', 'Falcons', 'Bengals', 'Jaguars', 'Texans', 'Cowboys', 'Vikings'), 1000), rank = sample(1:20,9000,replace = T))

Solution

  • Neilfws' answer is great, but I've always found the ridgeline geoms difficult to work with in circumstances like this, so I usually recreate them with geom_rect:

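    library(dplyr)
    library(ggplot2)
    library(ggridges)  # only needed for theme_ridges()
    
    results %>%
      count(team, rank) %>%          # one row per team/rank with its count n
      filter(rank<=10) %>%           # keep only ranks 1-10
      mutate(team=factor(team)) %>%  # factor so as.numeric(team) gives row positions
      ggplot() +
      # each bar starts at the team's row and rises in proportion to n
      geom_rect(aes(xmin=rank-0.5, xmax=rank+0.5, ymin=team, fill=team,
                    ymax=as.numeric(team)+n*0.75/max(n))) +
      # count label just below each bar's baseline
      geom_text(aes(x=rank, y=as.numeric(team)-0.1, label=n)) +
      theme_ridges() +
      theme(legend.position = "none") +
      scale_fill_manual(values = c("#4F2E84", "#FB4F14",  "#7C1415", "#A71930", 
                                   "#00143F", "#0C264C", "#192E6C", "#136677", 
                                   "#203731"), name = NULL) +
      ylab("team")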
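
    To make the bar-height arithmetic in the geom_rect call concrete, here is a minimal sketch on a tiny made-up data frame. The toy data and the bars, ymin and ymax names are assumptions for illustration only, not part of the NFL simulation. Each bar starts at the team's integer position and rises by n*0.75/max(n), so the tallest bar is 0.75 high and stays below the next team's row.

    ```r
    library(dplyr)

    # Hypothetical toy data, for illustration only
    toy <- data.frame(
      team = c("Jets", "Jets", "Jets", "Giants", "Giants", "Giants"),
      rank = c(1, 1, 2, 1, 2, 2)
    )

    bars <- toy %>%
      count(team, rank) %>%                 # one row per team/rank with count n
      mutate(team = factor(team),           # levels are alphabetical: Giants, Jets
             ymin = as.numeric(team),       # baseline at the team's integer position
             ymax = ymin + n * 0.75 / max(n))  # scaled by the largest count overall

    bars
    ```

    Because max(n) is taken over the whole table, bar heights are comparable across teams rather than rescaled per team.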
    

    [Image: the resulting plot, as requested]

    I especially like the level of fine control I get from geom_rect rather than ridgelines. But you do lose out on the nice bounding line drawn around each ridgeline, so if that's important then go with the other answer.
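
    If the bounding line matters, one possible middle ground is to keep the binline ridgelines from the question and overlay labels from a pre-counted table. This is only a sketch under assumptions, not the code from the other answer: the reduced three-team data, the top10 and counts names, and the nudge and text size values are all made up for illustration.

    ```r
    library(dplyr)
    library(ggplot2)
    library(ggridges)

    # Sample data in the same shape as the question's (values are random)
    set.seed(1)
    results <- data.frame(
      team = rep(c("Jets", "Giants", "Washington"), 300),
      rank = sample(1:20, 900, replace = TRUE)
    )

    top10  <- filter(results, rank <= 10)   # keep only the ranks that are plotted
    counts <- count(top10, team, rank)      # one label per team/rank bin

    p <- ggplot(top10, aes(x = rank, y = team)) +
      geom_density_ridges2(aes(fill = team), stat = "binline", binwidth = 1,
                           scale = 0.9, draw_baseline = TRUE) +
      # labels come from the pre-counted table, nudged just below each baseline
      geom_text(data = counts, aes(x = rank, y = team, label = n),
                nudge_y = -0.1, size = 3, inherit.aes = FALSE) +
      scale_x_continuous(limits = c(0, 11), breaks = 1:10) +
      theme_ridges() +
      theme(legend.position = "none")

    p
    ```

    Counting outside the plot avoids doing arithmetic on the discrete team variable inside aes(), which is where the geom_text attempt in the question runs into trouble.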