
Preserve location of missing columns in combined bar plot


A commonly posted issue is how to preserve the width of bars in ggplot when a column is missing (e.g. 1, 2). In addition to that fix, I also need to preserve the relative location of the missing bars (i.e. a gap should appear where the bar would have been).

ggplot(mtcars, aes(factor(cyl), fill = factor(gear))) +
  geom_bar(position = position_dodge(preserve = "single"))

Results - I can't post images yet

You'll note that in the group for 8 cylinders (far right) there is a missing column for green (gear=4). preserve="single" has corrected the width, but the blue bar (gear=5) has shifted left to fill the void.

How can I prevent this? I want there to be a gap where green would have been.

Thank you for any help.


Solution

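  • We get the frequency count by 'cyl' and 'gear', then expand the data with complete to get all the combinations, filling the count column 'n' with 0 for combinations that do not occur (by default, columns not mentioned in complete get NA for a missing combination), and then plot with ggplot. Because every 'cyl' group now has a row for every 'gear', even if its count is 0, each dodged bar keeps its slot and the gap appears where the missing bar would have been.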

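    library(dplyr)
    library(tidyr)
    library(ggplot2)

    mtcars %>%
      # count the cars in each observed cyl/gear combination
      count(cyl, gear) %>%
      # add the unobserved combinations, with a count of 0 instead of NA
      complete(cyl = unique(cyl), gear = unique(gear),
               fill = list(n = 0)) %>%
      ggplot(aes(factor(cyl), n, fill = factor(gear))) +
      # a zero-height bar still occupies its dodge slot, which leaves the gap
      geom_bar(stat = 'identity', position = 'dodge')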
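
To see what the complete step contributes, it may help to look at the counts before plotting. The following is only a small sketch of that check; the name `counts` is introduced here for illustration and is not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same count + complete steps as in the answer, stored instead of plotted.
# `counts` is a hypothetical name used only for this check.
counts <- mtcars %>%
  count(cyl, gear) %>%
  complete(cyl = unique(cyl), gear = unique(gear),
           fill = list(n = 0))

# All 3 x 3 cyl/gear combinations are now present (count() alone returns 8 rows).
nrow(counts)

# The combination that was missing from the plot, cyl = 8 with gear = 4,
# now appears as an explicit row with n = 0.
filter(counts, cyl == 8)
```

With that zero-count row in place, the dodge position has three bars to lay out in every group, so the gear = 5 bar is no longer shifted left.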