r ggplot2 facet median

Add hline with population median for each facet


In each facet, I'd like to draw a horizontal line that spans the whole panel at that facet's population median.

I tried to do this without creating a separate summary table, using the following code:

library(ggplot2)

dt <- data.frame(gr = rep(1:2, each = 500),
                 id = rep(1:5, 2, each = 100),
                 y = c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 1, sd = 2)))


ggplot(dt, aes(x = as.factor(id), y = y)) +
  geom_boxplot() +
  facet_wrap(~ gr) +
  geom_hline(aes(yintercept = median(y), group = gr), colour = 'red')

However, the line is drawn at the median of the entire dataset rather than at each facet's own median.
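
The difference between the two quantities can be seen outside ggplot2. The sketch below assumes the cause is that an expression inside aes() is evaluated once on the whole data frame, before the data are split into panels, so median(y) can only yield a single number. The seed and the names overall and per_facet are illustrative and not part of the original code.

```r
set.seed(1234)  # illustrative seed; the data in the question are unseeded

dt <- data.frame(gr = rep(1:2, each = 500),
                 id = rep(1:5, 2, each = 100),
                 y  = c(rnorm(500, mean = 0, sd = 1),
                        rnorm(500, mean = 1, sd = 2)))

# What median(y) gives when evaluated on all 1000 rows at once
overall <- median(dt$y)

# What the plot should show instead: one median per value of gr
per_facet <- tapply(dt$y, dt$gr, median)

print(overall)
print(per_facet)
```

The single value in overall is what both panels currently display; the two values in per_facet are the ones each panel should show.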

An older suggested solution was to use

  geom_line(stat = "hline", yintercept = "median")

but that stat is no longer available (it produces the error "No stat called StatHline").

Another suggested solution was

 geom_errorbar(aes(ymax=..y.., ymin=..y.., y = mean))

but it produces the error

Error in data.frame(y = function (x, ...)  : 
arguments imply differing number of rows: 0, 1000

Finally, it is possible to plot the medians by first building a separate summary table with the desired statistics, but I'd like to avoid that.
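
For reference, the summary-table approach mentioned above might look like the following. This is a minimal sketch, not code from the original post: the seed, the table name med, and the use of aggregate() are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1234)  # illustrative seed

dt <- data.frame(gr = rep(1:2, each = 500),
                 id = rep(1:5, 2, each = 100),
                 y  = c(rnorm(500, mean = 0, sd = 1),
                        rnorm(500, mean = 1, sd = 2)))

# Summary table with one row per facet: columns gr and y (the median of y)
med <- aggregate(y ~ gr, data = dt, FUN = median)

p <- ggplot(dt, aes(y = y)) +
  geom_boxplot(aes(x = as.factor(id))) +
  # med contains the faceting variable gr, so each row is drawn
  # only in the panel it belongs to
  geom_hline(data = med, aes(yintercept = y), colour = "red") +
  facet_wrap(~ gr)

print(p)
```

Because the line's data frame carries the gr column, facet_wrap() assigns each median to its own panel, and the line spans the full panel width.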


Solution

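  • If you don't want to add a new column with the computed median, you can add a geom_smooth layer that fits a quantile regression. With the intercept-only formula y ~ 1 and rq's default tau = 0.5, the fitted line is the median of y, and the fit is computed separately within each facet: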

    library(ggplot2)
    library(quantreg)
    
    set.seed(1234)
    
    dt <- data.frame(gr = rep(1:2, each = 500),
                    id = rep(1:5, 2, each = 100), 
                    y = c(rnorm(500, mean = 0, sd = 1),
                          rnorm(500, mean = 1, sd = 2)))
    
    ggplot(dt, aes(y = y)) +
      geom_boxplot(aes(x = as.factor(id))) +
      geom_smooth(aes(x = id), method = "rq", formula = y ~ 1, se = FALSE) +
      facet_wrap(~ gr)
    

    [Figure: boxplot and median]
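
As a sanity check on the quantile-regression approach, the intercept-only fit can be compared with median() for each facet outside ggplot2. This is a sketch under the same simulated data; the names rq_fit and med are illustrative. With an even number of observations per group, the two values may differ slightly, because median() averages the two middle observations while any value between them minimises the quantile-regression objective.

```r
library(quantreg)

set.seed(1234)

dt <- data.frame(gr = rep(1:2, each = 500),
                 id = rep(1:5, 2, each = 100),
                 y  = c(rnorm(500, mean = 0, sd = 1),
                        rnorm(500, mean = 1, sd = 2)))

# Intercept-only median regression (tau = 0.5), fitted once per facet group
rq_fit <- sapply(split(dt, dt$gr), function(d) {
  coef(rq(y ~ 1, tau = 0.5, data = d))[[1]]
})

# Plain per-group medians for comparison
med <- tapply(dt$y, dt$gr, median)

print(rbind(rq = rq_fit, median = med))
```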