Tags: r, ggplot2, data-visualization, boxplot, alpha-transparency

ggplot2 in R: why is the box plot fill colour very pale?


I've created box plots with ggplot() in R by following instructions online, but the fill colour is very transparent and doesn't show up as well as it does on other people's plots. How can I fix this?


Solution

    # Load ggplot2 and the mpg dataset
    
    library(ggplot2)
    
    data(mpg)
    head(mpg)
    
    # A tibble: 6 x 11
      manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class  
      <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
    1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compact
    2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compact
    3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compact
    4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compact
    5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compact
    6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compact
    

    Set a higher alpha = ... inside the call to geom_boxplot(). In general, alpha controls the opacity of a geom; it ranges from 0 (fully transparent) to 1 (fully opaque). Let's produce two plots using the mpg dataset. In the first, I set alpha = 0.9, producing rather opaque boxes:

    ggplot(data = mpg, aes(x = class, y = hwy)) + 
      geom_boxplot(fill = "blue", alpha = 0.9) +
      theme_minimal()
    

    [Image: box plots with a dark, nearly opaque fill]

    In the next plot, we set alpha = 0.2, which produces translucent boxes:

    ggplot(data = mpg, aes(x = class, y = hwy)) + 
      geom_boxplot(fill = "blue", alpha = 0.2) +
      theme_minimal()
    

    [Image: box plots with a light, translucent fill]
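
    If the aim is simply a solid fill, one option is to leave alpha out of geom_boxplot() altogether, in which case the fill is drawn fully opaque. A minimal sketch of that variant (the object name p_solid is only illustrative, and this plot is not one of the two shown above):

    ```r
    library(ggplot2)

    # Same plot as above, but with no alpha argument at all,
    # so the blue fill is drawn at full opacity.
    p_solid <- ggplot(data = mpg, aes(x = class, y = hwy)) +
      geom_boxplot(fill = "blue") +
      theme_minimal()

    print(p_solid)
    ```

    So if a tutorial's code includes something like alpha = 0.2 and the boxes look washed out, either raising the value or deleting the argument should darken the fill.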

    In general, we adjust the transparency of a plot element to down-weight less salient observations or to avoid over-plotting. I would argue that a lower alpha (a more transparent fill) works well in this setting: it accentuates the contours of the boxes and draws attention to the median within each categorical group.
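
    To illustrate the over-plotting point, here is a rough sketch that layers the raw observations over translucent boxes. The jitter width and the alpha values are arbitrary choices for illustration, and p_points is a made-up name:

    ```r
    library(ggplot2)

    # Translucent boxes with semi-transparent raw points on top.
    # outlier.shape = NA hides the boxplot's own outlier markers so
    # those observations are not drawn twice.
    p_points <- ggplot(data = mpg, aes(x = class, y = hwy)) +
      geom_boxplot(fill = "blue", alpha = 0.2, outlier.shape = NA) +
      geom_jitter(width = 0.2, height = 0, alpha = 0.4) +  # horizontal jitter only
      theme_minimal()

    print(p_points)
    ```

    Because each point is partly transparent, regions where many observations overlap appear darker, which gives a rough sense of density alongside the box summaries.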