r ggplot2 confidence-interval

ggplot Stacked barplot with uncertainty


I would like to show the confidence interval in a stacked barplot. With whiskers the display is not quite right: the whiskers are always attached to the end of each segment, which makes no sense where the stack reaches 100%. While searching for alternative ways of plotting, I came across the following (see image). Is something like this possible in ggplot2, for example with add-ons?

Thanks in advance.


Solution

  • Yes, it's possible, but it requires a little data manipulation, faking a discrete axis with a continuous one, and using geom_area instead of geom_bar.

    In the absence of a reproducible example, here's one using the iris data set:

    library(tidyverse)  # reframe() below needs dplyr >= 1.1.0
    
    iris %>%
      group_by(Species) %>%
      # Level order matters: prop.test() on a two-entry table treats the
      # first entry as the success count, so 'wide' has to come first
      mutate(width = factor(ifelse(Sepal.Width > 3, 'wide', 'narrow'),
                            levels = c('wide', 'narrow'))) %>%
      nest(data = - Species) %>%
      # Proportion of 'wide' measurements per species and its confidence interval
      mutate(data = map(data, ~ prop.test(table(.x$width))),
             prop = unlist(map(data, ~ .x$estimate)),
             lower = unlist(map(data, ~ .x$conf.int[1])),
             upper = unlist(map(data, ~ .x$conf.int[2]))) %>%
      select(-data) %>%
      # Two rows per species: 'wide' and its complement 'narrow'
      reframe(prop = c(prop, 1 - prop), 
              lo = c(lower, 1 - upper),
              hi = c(upper, 1 - lower),
              width = c('wide', 'narrow')) %>%
      group_by(Species, width) %>%
      # Six vertices per bar segment on a continuous x axis; the slanted
      # middle part runs across the confidence interval
      reframe(x = as.numeric(Species) +
                c(-0.25, -0.25, -0.083, 0.083, 0.25, 0.25),
              y = c(0, prop, ifelse(width == 'wide', lo, hi),
                    ifelse(width == 'wide', hi, lo), prop, 0)) %>%
      ggplot(aes(x, y, fill = width)) + 
      geom_area(position = 'stack') +
      # Label the continuous axis as if it were discrete
      scale_x_continuous(breaks = 1:3, labels = levels(iris$Species), 
                         name = 'Species') +
      ylab('Proportion of measurements') +
      coord_flip() +
      scale_fill_brewer(palette = 'Set1') +
      theme_minimal(base_size = 16)
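
The geometry in the pipeline above may be easier to follow for a single bar. The following is a minimal base-R sketch, not part of the original answer: the helper name `bar_outline` is hypothetical, and the offsets are simply the ones used above. It builds the six vertices for the setosa bar by hand and checks that the two stacked outlines always add up to 1 between the vertical edges, which is why the slanted boundary can stand in for the confidence interval.

```r
# Hypothetical helper: six (x, y) vertices of one bar segment.
# The outer points form the vertical edges; the two middle points
# put the slanted part between the interval bounds.
bar_outline <- function(center, prop, first, second) {
  data.frame(
    x = center + c(-0.25, -0.25, -0.083, 0.083, 0.25, 0.25),
    y = c(0, prop, first, second, prop, 0)
  )
}

# Proportion of setosa measurements with Sepal.Width > 3, plus its CI
is_wide <- iris$Sepal.Width[iris$Species == "setosa"] > 3
test    <- prop.test(sum(is_wide), length(is_wide))

p  <- unname(test$estimate)
ci <- test$conf.int

# 'wide' rises from the lower to the upper bound; 'narrow' is its
# complement, so it falls from 1 - lower to 1 - upper
wide   <- bar_outline(1, p, ci[1], ci[2])
narrow <- bar_outline(1, 1 - p, 1 - ci[1], 1 - ci[2])

# Stacked height at each of the six x positions
stacked <- wide$y + narrow$y
```

Because `stacked` equals 1 at the four inner vertices, the top of the stack stays flat and only the boundary between the two fills is tilted.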
    
