
Use ggrepel to display the mean value as a text label


I am creating a plot in which I want to display the mean of each group. I have managed to show both the mean point and its value, but the plot looks too cluttered with the values printed right next to the data, so I would like to use ggrepel::geom_label_repel to show the means as text labels placed a bit farther from the data points. My attempt below doesn't work, and I would appreciate help getting the desired result. Thanks.

library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.4.3
library(ggrepel)
#> Warning: package 'ggrepel' was built under R version 3.4.2


# helper for stat_summary(): returns the mean as both the y position and the label
fun_mean <- function(x) {
  m <- mean(x, na.rm = TRUE)
  data.frame(y = m, label = m)
}

# preparing the basic plot
plot <-
  ggplot2::ggplot(data = iris,
                  mapping = aes(x = Species, y = Sepal.Length)) +
  geom_point(
    position = position_jitterdodge(
      jitter.width = NULL,
      jitter.height = 0.2,
      dodge.width = 0.75
    ),
    alpha = 0.5,
    size = 3,
    aes(color = factor(Species))
  ) +
  geom_violin(width = 0.5,
              alpha = 0.2,
              fill = "white") +
  geom_boxplot(
    width = 0.3,
    alpha = 0.2,
    fill = "white",
    outlier.colour = "black",
    outlier.shape = 16,
    outlier.size = 3,
    outlier.alpha = 0.7,
    position = position_dodge(width = NULL)
  ) +
  theme(legend.position = "none")


# add the mean label to the plot
plot <- plot +
  stat_summary(
    fun.y = mean,
    geom = "point",
    colour = "darkred",
    size = 5
  ) +
  stat_summary(
    fun.data = fun_mean,
    geom = "text",
    vjust = -1.0,
    size = 5
  )

# see the plot
plot

# adding geom_label_repel
plot <-
  plot +
  ggrepel::geom_label_repel(
    mapping = aes(label = mean),
    fontface = 'bold',
    color = 'black',
    inherit.aes = FALSE,
    max.iter = 3e2,
    box.padding = 0.35,
    point.padding = 0.5,
    segment.color = 'grey50',
    force = 2
  )

plot # doesn't work :(
#> Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
#> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 0, 150

Created on 2018-02-10 by the reprex package (v0.1.1.9000).


Solution

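  • The error comes from the line mapping = aes(label = mean) within ggrepel::geom_label_repel(). The data has no column called mean, so mean resolves to the base R function mean(), which is why ggplot2 complains about an "object of type function".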
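
    To see why that mapping fails, one can check what the name mean refers to when the data has no such column. A minimal base R sketch (the variable name has_mean_column is only illustrative):

    ```r
    # iris has no column named "mean" ...
    has_mean_column <- "mean" %in% names(iris)
    has_mean_column
    #> [1] FALSE

    # ... so the bare name `mean` falls back to the base R function
    is.function(mean)
    #> [1] TRUE
    ```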

    You could try the following: first create a dataset that contains the mean of Sepal.Length for every Species.

    # columns 1 and 5 of iris are Sepal.Length and Species
    mean_dat <- aggregate(. ~ Species, data = iris[, c(1, 5)], FUN = mean)
    names(mean_dat) <- c("Species", "mean_label") # renaming is optional, but it may avoid confusion
    mean_dat
    #     Species mean_label
    #1     setosa      5.006
    #2 versicolor      5.936
    #3  virginica      6.588
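
    As a quick cross-check of these values, the same group means can be computed with tapply(). This is only a sketch for verification; the name means is chosen here for illustration.

    ```r
    # mean of Sepal.Length within each level of Species
    means <- tapply(iris$Sepal.Length, iris$Species, mean)
    round(means, 3)
    #>     setosa versicolor  virginica
    #>      5.006      5.936      6.588
    ```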
    

    We will use the column mean_label from this dataset as the label aesthetic, i.e. geom_label_repel(..., mapping = aes(label = mean_label)), so mean_dat has to be passed to geom_label_repel() as its data argument.

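    # `plot` is assumed to be the basic plot from the question, before any mean layers were added
    plot +
      stat_summary(
        fun.y = mean, # renamed to `fun` in ggplot2 3.3.0; use fun = mean there
        geom = "point",
        colour = "darkred",
        size = 5
      ) +
      ggrepel::geom_label_repel(
        data = mean_dat,
        # x = Species is inherited from the plot; y is mapped explicitly
        # because mean_dat has no Sepal.Length column after the renaming
        mapping = aes(y = mean_label, label = mean_label),
        fontface = 'bold',
        color = 'black',
        # inherit.aes = FALSE would drop the inherited x aesthetic, so
        # geom_label_repel would fail with a missing-aesthetics error
        max.iter = 3e2,
        box.padding = 0.35,
        point.padding = 0.5,
        segment.color = 'grey50',
        force = 2
      )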
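
    If you do want to keep inherit.aes = FALSE, every aesthetic the label layer needs has to be mapped in that layer. The following is a minimal sketch under that assumption, not the code from the question: it uses a plain boxplot, and the names mean_dat2 and p as well as the two-decimal label format are choices made here for illustration.

    ```r
    library(ggplot2)
    library(ggrepel)

    # per-species means; the column keeps its original name Sepal.Length
    mean_dat2 <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
    # label text rounded to two decimals (an illustrative choice)
    mean_dat2$mean_label <- sprintf("%.2f", mean_dat2$Sepal.Length)

    p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
      geom_boxplot(width = 0.3) +
      geom_label_repel(
        data = mean_dat2,
        # with inherit.aes = FALSE nothing is taken from ggplot(),
        # so x, y and label are all mapped here
        mapping = aes(x = Species, y = Sepal.Length, label = mean_label),
        inherit.aes = FALSE,
        box.padding = 0.35,
        point.padding = 0.5
      )
    p
    ```

    Because the label layer carries its own x and y, it no longer depends on the column names used in the main plot.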
    
