ggplot2, data-visualization, geom-bar

Is there a way to show that the mean of a group is at zero?


I am trying to make a quick graph to show the means of a few groups. My y axis ranges from -2 to +2. Two of my groups have a mean of zero, so nothing shows up on the graph for them. Is there a good way to depict the two means at zero? Zero is a meaningful number. I was thinking of something like a small pink or blue bar.

Any suggestions? Sorry in advance for the messy code!

conseq_summary_table=structure(list(age_group = c("4", "4", "5", "5", "adult", "adult"), 
condition_motive = c("bad", "good", "bad", "good", "bad", "good"), 
group_conseq_mean = c(0, 0.192307692307, 0.133333333333, 
   -0.0333333333333, -0.710526315789, 0), 
conseq_sd = c(0.577350269189, 0.722797272709, 
   0.549891764241,0.611399643285, 0.450795268685, 0)), 
row.names = c(NA, -6L), 
class = c("grouped_df", "tbl_df", "tbl", "data.frame"), 
groups = structure(list(age_group = c("4", "5", "adult"), 
 .rows = list(1:2, 3:4, 5:6)), row.names = c(NA, -3L), 
 class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))

Here is my current code:

    library(ggplot2)
    
    # y must match the column name in the data: group_conseq_mean
    ggplot(conseq_summary_table, aes(age_group, group_conseq_mean)) +
      geom_bar(aes(fill = condition_motive), stat = "identity", position = "dodge",
               alpha = .7) +
      labs(title = "Summary of conseq", x = "Age Group", y = "Average conseq") +
      theme_minimal() +
      scale_y_continuous(expand = c(0, 0),
                         limits = c(-2, 2)) +
      geom_hline(yintercept = 0)

Bar plot


Solution

  • I know it's not really my remit, but may I allow myself some comments on your plot?

    • For what you want to show, bar plots are not appropriate. I think that is one of the reasons you were struggling with this visualisation. If you want to show the mean and standard deviation, a much more appropriate visualisation is dots with error bars.
    • Whenever possible, calculate summary statistics within ggplot2! There is often no need to pre-process your data. There are obviously several reasons why you may not have the raw data, and sometimes one can't get around pre-processing. But if you do have access to the raw data, your case would be a perfect example for the use of summary stats such as the mean and standard deviation mentioned above, or box plots.
    • Think about using facets! Separating your data visually helps the reader a lot to identify patterns and to "see the story". Your task would be to find the best facet for visualisation!
    • Don't use axis limits that do not make sense! In your case, stick to ±1 rather than ±2. As a matter of fact, ggplot2 often chooses very reasonable limits, so you may not need to specify limits at all! And if you do limit your axes, it is best to do this within coord_...() calls.
    • The default colors are ggplot2's one big flaw. However, there is an integrated set of color palettes based on https://colorbrewer2.org; you can choose color-blind-friendly palettes, etc.
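
    To illustrate the point about axis limits, here is a minimal sketch, assuming the summary values from the question are copied into a plain data frame. The object names `summary_df`, `base_plot`, `p_scale` and `p_coord` are made up for this illustration. Limits set on the scale censor values outside the range (they become NA before drawing), whereas limits set in coord_cartesian() only zoom the view. With limits of -1 to 1, the lower end of the adult/bad range (about -1.16) is out of bounds, so the two versions should differ for that group.

    ```r
    library(ggplot2)

    # Summary values copied from the question into a plain data frame
    summary_df <- data.frame(
      age_group = c("4", "4", "5", "5", "adult", "adult"),
      condition_motive = c("bad", "good", "bad", "good", "bad", "good"),
      group_conseq_mean = c(0, 0.192307692307, 0.133333333333,
                            -0.0333333333333, -0.710526315789, 0),
      conseq_sd = c(0.577350269189, 0.722797272709, 0.549891764241,
                    0.611399643285, 0.450795268685, 0)
    )

    base_plot <- ggplot(summary_df, aes(age_group, group_conseq_mean,
                                        color = condition_motive)) +
      geom_pointrange(aes(ymin = group_conseq_mean - conseq_sd,
                          ymax = group_conseq_mean + conseq_sd),
                      position = position_dodge(width = 0.5))

    # Limits on the scale: out-of-range values are censored (set to NA)
    p_scale <- base_plot + scale_y_continuous(limits = c(-1, 1))

    # Limits on the coordinate system: the data stay untouched, only the view is zoomed
    p_coord <- base_plot + coord_cartesian(ylim = c(-1, 1))
    ```

    Printing `p_scale` and `p_coord` one after the other shows the difference: the scale version should drop the adult/bad point range (usually with a warning about removed rows), while the coord version keeps it and simply clips it at the panel edge.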

    Below is a suggestion for how I would plot this, based on the above comments:

    library(ggplot2)
    
    conseq_summary_table <- structure(list(age_group = c("4", "4", "5", "5", "adult", "adult"), condition_motive = c("bad", "good", "bad", "good", "bad", "good"), group_conseq_mean = c(0, 0.192307692307, 0.133333333333, -0.0333333333333, -0.710526315789, 0), conseq_sd = c(0.577350269189, 0.722797272709, 0.549891764241,0.611399643285, 0.450795268685, 0)), row.names = c(NA, -6L), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), groups = structure(list(age_group = c("4", "5", "adult"), .rows = list(1:2, 3:4, 5:6)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))
    
    ggplot(conseq_summary_table, aes(age_group, group_conseq_mean )) +
      geom_hline(yintercept=0) +
      geom_pointrange(aes(ymin= group_conseq_mean - conseq_sd, 
                          ymax = group_conseq_mean + conseq_sd, color = condition_motive),
                      position = position_dodge(width = 1)) +
      scale_color_brewer(palette = 'Set1') +
      scale_y_continuous() +
      facet_wrap(~ age_group, scales = 'free_x') # maybe try facetting by condition_motive instead
    

    Created on 2020-03-24 by the reprex package (v0.3.0)
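
    As a sketch of the "calculate summary statistics within ggplot2" advice: assuming the raw, per-participant scores were available, stat_summary() could compute the mean and standard deviation at plot time. The data frame `raw_scores`, its `conseq` column and the helper `mean_sd()` are invented for illustration, and the values are simulated, not the asker's data.

    ```r
    library(ggplot2)

    # Hypothetical raw data: one row per participant (simulated, NOT the real data)
    set.seed(1)
    raw_scores <- data.frame(
      age_group        = rep(c("4", "5", "adult"), each = 40),
      condition_motive = rep(rep(c("bad", "good"), each = 20), times = 3),
      conseq           = sample(c(-1, 0, 1), size = 120, replace = TRUE)
    )

    # Hypothetical helper: mean +/- 1 SD in the y / ymin / ymax layout
    # that stat_summary(fun.data = ...) expects
    mean_sd <- function(x) {
      data.frame(y = mean(x), ymin = mean(x) - sd(x), ymax = mean(x) + sd(x))
    }

    p <- ggplot(raw_scores, aes(condition_motive, conseq, color = condition_motive)) +
      geom_hline(yintercept = 0) +
      # One point range per condition and facet, computed from the raw rows
      stat_summary(fun.data = mean_sd, geom = "pointrange") +
      scale_color_brewer(palette = "Set1") +
      facet_wrap(~ age_group)
    p
    ```

    A group whose mean is exactly zero still gets a visible dot on the zero line here, which is the case the question was worried about.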