
Is there a way to add labels with the number of observations per value to a violin plot in ggplot2?


Imagine you want to create a violin plot and have data like this:

library(ggplot2)

set.seed(123)
Bodytype <- sample(LETTERS[1:3], 500, replace = TRUE)
Weight <- rnorm(500, 40, 1)
df <- data.frame(Bodytype, Weight)

ggplot(data = df,
       aes(x = Bodytype, y = Weight, fill = Bodytype)) +
  geom_violin(scale = "count", trim = FALSE, adjust = 0.75) +
  scale_y_continuous(breaks = seq(34, 46, 1)) +
  theme_gray()

Now I would like to add a text label (or something similar) for each body type at each kg level, to see how many observations in each body type category weigh 36 kg, 37 kg, etc. Is there a way to achieve this, or am I better off using another plot?


Solution

  • This can be done in many ways; here is one:

    library(dplyr)
    library(ggplot2)

    # One row per body type: the group size n, placed at the group's mean weight
    summ <- df %>%
      group_by(Bodytype) %>%
      summarize(n = n(), Weight = mean(Weight))

    ggplot(data = df, aes(x = Bodytype, y = Weight, fill = Bodytype)) +
      geom_violin(scale = "count", trim = FALSE, adjust = 0.75) +
      scale_y_continuous(breaks = seq(34, 46, 1)) +
      theme_gray() +
      # x and y are inherited from the main aes(); only the label is new
      geom_text(aes(label = n), data = summ)
    

    [Figure: ggplot2 violin plot with count labels at the group means]


    Okay, so you want multiple weight counts:

    # One row per body type and whole-kg weight, with the number of observations n
    weightcounts <- df %>%
      mutate(Weight = as.integer(round(Weight, 0))) %>%
      group_by(Bodytype, Weight) %>%
      count()

    ggplot(data = df, aes(x = Bodytype, y = Weight, fill = Bodytype)) +
      geom_violin(scale = "count", trim = FALSE, adjust = 0.75) +
      scale_y_continuous(breaks = seq(34, 46, 1)) +
      theme_gray() +
      # Each label is drawn at its rounded weight on the y axis
      geom_text(aes(label = n), data = weightcounts)
    

    [Figure: second ggplot2 violin plot, with per-weight counts]

    Either way, the idea is the same: build a summary data frame that holds the labels you need along with the x and y positions where they should appear, then add geom_text (or geom_label) and pass that frame as its data argument.
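
As a rough sketch of that pattern with geom_label, the following builds the per-kg counts with base R only and passes them to the label layer. The names `label_df` and `kg` are made up for this illustration, and rounding to whole kilograms is an assumption about what "each kg level" means; adjust the binning if you need something else.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  Bodytype = sample(LETTERS[1:3], 500, replace = TRUE),
  Weight   = rnorm(500, 40, 1)
)

# Assumption: "kg level" means the weight rounded to the nearest whole kg.
df$kg <- round(df$Weight)

# Summary frame (hypothetical name): one row per Bodytype/kg pair with count n.
label_df <- as.data.frame(
  table(Bodytype = df$Bodytype, kg = df$kg),
  responseName = "n",
  stringsAsFactors = FALSE
)
label_df$kg <- as.numeric(label_df$kg)      # table() stores the levels as text
label_df <- label_df[label_df$n > 0, ]      # drop combinations with no data

p <- ggplot(df, aes(x = Bodytype, y = Weight, fill = Bodytype)) +
  geom_violin(scale = "count", trim = FALSE, adjust = 0.75) +
  scale_y_continuous(breaks = seq(34, 46, 1)) +
  # inherit.aes = FALSE keeps the label boxes from picking up the violin fill
  geom_label(data = label_df,
             aes(x = Bodytype, y = kg, label = n),
             inherit.aes = FALSE, size = 3)

print(p)
```

Setting inherit.aes = FALSE is a design choice here, not a requirement: without it, geom_label would inherit fill = Bodytype and color each label box like its violin, which can make the numbers harder to read.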