r ggplot2 bar-chart standard-deviation

Standard deviation in ggplot2 does not match data and looks strange in graph


I'm trying to plot the standard deviation of my data using ggplot2. I've already asked here once about another part of the code, and some people very kindly helped me. However, I'm still running into some issues. I wanted to post a smaller sample of my data, but I thought something might be missed that way, for reasons I explain further down. Please excuse the huge dput:

> dput(EGG)
structure(list(day = c("0", "0", "0", "0", "0", "0", "7", "7", 
"7", "7", "7", "7", "14", "14", "14", "14", "14", "14", "21", 
"21", "21", "21", "21", "21", "28", "28", "28", "28", "28", "28", 
"0", "0", "0", "0", "0", "0", "7", "7", "7", "7", "7", "7", "14", 
"14", "14", "14", "14", "14", "21", "21", "21", "21", "21", "21", 
"28", "28", "28", "28", "28", "28", "0", "0", "0", "0", "0", 
"0", "7", "7", "7", "7", "7", "7", "14", "14", "14", "14", "14", 
"14", "21", "21", "21", "21", "21", "21", "28", "28", "28", "28", 
"28", "28", "0", "0", "0", "0", "0", "0", "7", "7", "7", "7", 
"7", "7", "14", "14", "14", "14", "14", "14", "21", "21", "21", 
"21", "21", "21", "28", "28", "28", "28", "28", "28", "0", "0", 
"0", "0", "0", "0", "7", "7", "7", "7", "7", "7", "14", "14", 
"14", "14", "14", "14", "21", "21", "21", "21", "21", "21", "28", 
"28", "28", "28", "28", "28", "0", "0", "0", "0", "0", "0", "7", 
"7", "7", "7", "7", "7", "14", "14", "14", "14", "14", "14", 
"21", "21", "21", "21", "21", "21", "28", "28", "28", "28", "28", 
"28", "0", "0", "0", "0", "0", "0", "7", "7", "7", "7", "7", 
"7", "14", "14", "14", "14", "14", "14", "21", "21", "21", "21", 
"21", "21", "28", "28", "28", "28", "28", "28"), chemcon = c(15.5220395247868, 
9.06570359183137, 13.8392220116086, 10.0864401981599, 13.940373396987, 
14.5688, 14.4688, 13.7392220116086, 13.7504076433121, 10.3938523092218, 
17.7940351604959, 13.790373396987, 17.8440351604959, 13.9164076433121, 
8.91570359183137, 7.85248618100562, 18.1470351604959, 13.6892220116086, 
14.4188, 15.6613170443075, 15.3720395247868, 13.8762220116086, 
9.26870359183137, 8.15548618100562, 9.10270359183138, 15.5590395247868, 
12.6443936693897, 14.6058, 10.3438523092218, 13.6134076433121, 
12.7443936693897, 13.7134076433121, 7.95248618100562, 10.4938523092218, 
17.9440351604959, 15.7613170443075, 15.9166319769788, 16.0236161993658, 
14.449820053383, 23.6434916258293, 29.23, 16.6136319769788, 14.4337543105197, 
25.8571621909037, 27.2488509180735, 26.3405437141942, 26.6088509180735, 
12.3358365758755, 10.5308523092218, 15.7506319769788, 25.6435437141942, 
26.9373074696005, 21.880025633791, 15.9606007018431, 12.5943936693897, 
15.7983170443075, 25.2859002423082, 31.4473460721868, 22.683460279146, 
25.9829002423082, 11.82042, 15.5199002423082, 16.6636161993658, 
15.089820053383, 13.03551, 9.48875, 25.7405437141942, 18.6937124658781, 
30.7973460721868, 22.033460279146, 18.7437124658781, 25.6905437141942, 
25.7915437141942, 30.8473460721868, 22.083460279146, 18.8807124658781, 
25.3329002423082, 15.9636319769788, 25.6859002423082, 16.3166319769788, 
26.0435437141942, 22.220460279146, 25.5199002423082, 16.1506319769788, 
25.8775437141942, 19.0467124658781, 31.1503460721868, 22.386460279146, 
25.3829002423082, 16.0136319769788, 14.143373396987, 8.96570359183137, 
12.7813936693897, 13.5634076433121, 13.977373396987, 15.9643170443075, 
25.5199002423082, 30.5559941059036, 32.419, 34.0709071343396, 
36.9364807907246, 49.7582480724103, 39.5201, 55.8759773695839, 
45.219463429644, 41.3352176220807, 35.1199, 51.813050421884, 
41.6932176220807, 38.8525423728814, 40.1979941059036, 56.6606438665008, 
51.6796783998212, 44.861463429644, 36.6956947393253, 51.455050421884, 
56.2339773695839, 53.3216783998212, 64.124863506094, 37.0536947393253, 
5.293902130258, 17.55457, 9.98644019815995, 15.7250395247868, 
15.6113170443075, 13.840373396987, 20.9572380860577, 23.361216730038, 
25.6685910032803, 16.1506319769788, 25.6859002423082, 16.3166319769788, 
26.5473074696005, 21.490025633791, 15.5706007018431, 14.0437543105197, 
25.4671621909037, 17.9810351604959, 23.1734916258293, 38.7599759195535, 
16.1936161993658, 14.619820053383, 26.7788509180735, 12.5058365758755, 
18.8437124658781, 30.9473460721868, 22.183460279146, 25.4829002423082, 
16.1136319769788, 25.8405437141942, 18.7764641729212, 15.4965541256184, 
14.0422220116086, 7.98948618100562, 15.4220395247868, 10.1234401981599, 
18.7947124658781, 23.3114, 26.99513, 18.7447124658781, 30.8483460721868, 
22.084460279146, 24.0930028246135, 31.7642972700841, 20.0491670749633, 
27.7456849010516, 26.026400413152, 25.2619728008263, 26.0435437141942, 
22.220460279146, 25.1700273310577, 27.6462189543881, 24.8183371992736, 
32.1328682890728, 23.6425352458789, 28.3907888290054, 36.0970516777201, 
24.7983371992736, 25.7415437141942, 30.9843460721868, 9.93644019815994, 
7.80248618100562, 10.6968523092218, 10.2894401981599, 14.7718, 
12.9473936693897, 36.0500998193745, 27.962318404467, 50.8635290067137, 
33.3141610679696, 31.6036504707596, 44.4839329775636, 41.6817324840764, 
45.3996004974937, 44.5523364548183, 45.6550955414013, 50.2946957467359, 
41.0826507304484, 54.066863506094, 45.161463429644, 49.7002480724103, 
39.1525423728814, 45.4979941059036, 41.6352176220807, 51.6216783998212, 
46.9956947393253, 51.755050421884, 50.6026438665008, 53.1759773695839, 
52.3618577776276), type = c("control", "control", "control", 
"control", "control", "control", "control", "control", "control", 
"control", "control", "control", "control", "control", "control", 
"control", "control", "control", "control", "control", "control", 
"control", "control", "control", "control", "control", "control", 
"control", "control", "control", "nZn1", "nZn1", "nZn1", "nZn1", 
"nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", 
"nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", 
"nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", "nZn1", 
"nZn1", "nZn1", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", 
"nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", 
"nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", 
"nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", "nZn10", 
"nZn10", "nZn10", "nZn10", "nZn10", "nZn100", "nZn100", "nZn100", 
"nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", 
"nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", 
"nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", 
"nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "nZn100", "Zn1", 
"Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", 
"Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", 
"Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", "Zn1", 
"Zn1", "Zn1", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", 
"Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", 
"Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", 
"Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", "Zn10", 
"Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", 
"Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", 
"Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", 
"Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", "Zn100", 
"Zn100", "Zn100")), row.names = c(NA, -210L), class = c("tbl_df", 
"tbl", "data.frame"))
head(EGG)
# A tibble: 6 x 3
  day   chemcon type   
  <chr>   <dbl> <chr>  
1 0       15.5  control
2 0        9.07 control
3 0       13.8  control
4 0       10.1  control
5 0       13.9  control
6 0       14.6  control

Here's my code thus far:

library(ggplot2)
Figure1 <- ggplot(EGG,aes(x = type, y = chemcon, fill = day))
 
Figure1 +
   geom_bar(stat="identity", position= "dodge") + #nb you can just use 'dodge' in barplots
   scale_fill_brewer(palette="Paired")+
   theme_minimal() +
   labs(x="", y="chemcon") +
   theme(panel.background = element_blank(),
         axis.line = element_line(colour = "black"),
         panel.grid=element_blank()) +
   geom_errorbar(aes(ymin = chemcon - .5 * sd(chemcon),
                     ymax = chemcon + .5 * sd(chemcon)), 
                 position = "dodge")

This was the initial code I got thanks to someone on this website. Unfortunately, it does not seem to work well for the standard deviation. (image: first_output)

Then I tried replacing the geom_errorbar bit with each of the following:

stat_summary(fun.data=mean_cl_boot, 
             geom="errorbar", 
             width=0.2, 
             position=position_dodge(width=0.90))

#####

  stat_summary(fun.data=mean_sdl, 
               geom="errorbar", 
               width=0.2, 
               position=position_dodge(width=0.90))

########

stat_summary(fun.data = mean_se, geom = 'errorbar', position = 'dodge')

and these are the results: (images 2, 3 and 4)

As you can see, none of these is quite what I want. I specifically want the standard deviation of each bar in the plot (for example, for day 7 and type Zn100, what was the standard deviation of chemcon?). The error bars I get are either far too many or out of place.

(There was an issue with an Excel image I provided; the person who originally made it just told me to ignore it.)


Solution

  • First compute a dataset (called cc in the code) that holds only the mean and SD for each group, i.e. each day/type combination. In your case that is 5 days × 7 types = 35 groups. Then use this dataset for plotting. Personally, I prefer to compute the values I want to display myself rather than let the plotting functions do it; in my opinion it is less error-prone.

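    # Per-group mean and SD of chemcon, one row per day/type combination
    aa <- aggregate(chemcon ~ day + type, data=EGG, FUN=mean)
    bb <- aggregate(chemcon ~ day + type, data=EGG, FUN=sd)
    # Join the two summaries; the two chemcon columns come out as chemcon.x / chemcon.y
    cc <- merge(aa, bb, by=c("day", "type"))
    colnames(cc)[3:4] <- c("mean", "sd")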
    
    ggplot(cc, aes(x = type, y = mean, fill = day))+
      geom_bar(stat="identity", position= "dodge") + #nb you can just use 'dodge' in barplots
      scale_fill_brewer(palette="Paired")+
      theme_minimal() +
      labs(x="", y="chemcon") +
      theme(panel.background = element_blank(),
            axis.line = element_line(colour = "black"),
            panel.grid=element_blank()) +
      geom_errorbar(aes(ymin = mean-sd,
                        ymax = mean+sd), 
                    position = "dodge")
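
To see why the summarised table matters, here is a minimal base-R sketch on made-up data. The data frame `toy` and its values are hypothetical stand-ins for `EGG`, not your real measurements. As far as I can tell, `sd(chemcon)` inside `aes()` is evaluated once over the whole column, so every row of the raw data gets its own error bar, all with the same width. The sketch compares that single global SD with the per-group SDs. It also turns `day` into a factor, because `day` is stored as character and would otherwise sort as "0", "14", "7".

```r
# Toy data standing in for EGG: 2 types x 3 days x 4 replicates (values are made up).
set.seed(1)
toy <- data.frame(
  day     = rep(rep(c("0", "7", "14"), each = 4), times = 2),
  type    = rep(c("control", "Zn100"), each = 12),
  chemcon = c(rnorm(12, mean = 15, sd = 2), rnorm(12, mean = 40, sd = 8))
)

# sd() over the whole column: a single number shared by every row.
global_sd <- sd(toy$chemcon)

# Per-group mean and SD, one row per day/type combination (same steps as above).
toy_means <- aggregate(chemcon ~ day + type, data = toy, FUN = mean)
toy_sds   <- aggregate(chemcon ~ day + type, data = toy, FUN = sd)
toy_cc    <- merge(toy_means, toy_sds, by = c("day", "type"))
colnames(toy_cc)[3:4] <- c("mean", "sd")

# Give day an explicit level order so the days appear in numeric order.
toy_cc$day <- factor(toy_cc$day, levels = c("0", "7", "14"))
toy_cc <- toy_cc[order(toy_cc$type, toy_cc$day), ]

print(global_sd)  # one value for the whole data set
print(toy_cc)     # one mean and one SD per bar
```

With the real data, the same idea would be `cc$day <- factor(cc$day, levels = c("0", "7", "14", "21", "28"))` before the `ggplot()` call above, so the bars and legend follow the order of the days.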