Tags: r, ggplot2, mean, standard-deviation

Find mean and sd in Iris data set and draw graph


For every numeric variable in the iris data set, I need to find the mean and standard deviation by Species and plot them in ggplot2 with geom_col and geom_errorbar.

This is what I have so far:

library(tidyverse)
data(iris)
iris %>% 
  group_by(Species) %>% 
    summarise_if(is.numeric, list(mean = mean, sd = sd)) -> IrisData

I tried to create a graph, but I don't know how to use geom_errorbar:

IrisData %>%
  select(Species, ends_with("mean")) %>%
  gather(key, val, 2:5) %>%
  ggplot(aes(key, val, fill = Species)) +
  geom_col()

I found that it should look something like this

geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width=0.2)

But I'm not sure how to use it. I added the following to the end of the code and I get a graph, but I'm sure it's not right:

geom_errorbar(aes(ymin = val - sd(val), ymax = val + sd(val)), width=0.2, size = 1.2) 

Solution

  • geom_col() stacks the bars by default, and ggplot2 does not stack error bars to match. For a stacked bar plot you would have to compute the error bar positions by hand, and the result is not that good anyway. If you do not need stacking, you can dodge the bars and use something like

    library(tidyverse)
    data(iris)
    
    # Long format: one row per Species and variable, with its mean and SD
    iris %>% 
      pivot_longer(-Species) %>% 
      group_by(Species, name) %>% 
      summarise(Mean = mean(value),
                SD = sd(value),
                .groups = "drop") -> IrisData
    
    # Dodged bars with error bars; both layers use the same dodge width
    # so that each error bar sits on its own bar
    IrisData %>%
      ggplot(aes(name, Mean, fill = Species)) +
      geom_col(position = position_dodge(0.9)) +
      geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD),
                    width = 0.2, position = position_dodge(0.9))
    

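
The wide table from the question (the summarise_if result) can also be reused instead of recomputing the statistics. The following is only a minimal sketch, assuming tidyr >= 1.0 for pivot_longer and its special ".value" name; the object names iris_wide and iris_long are illustrative and not part of the original code.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Wide summary as in the question: one row per Species, with columns
# such as Sepal.Length_mean and Sepal.Length_sd
iris_wide <- iris %>%
  group_by(Species) %>%
  summarise_if(is.numeric, list(mean = mean, sd = sd))

# Split each column name at "_": the first part becomes the variable name,
# and ".value" turns the second part (mean / sd) into separate columns
iris_long <- iris_wide %>%
  pivot_longer(-Species,
               names_to = c("name", ".value"),
               names_sep = "_")

# With mean and sd as columns, the snippet found in the question applies as is
p <- ggplot(iris_long, aes(name, mean, fill = Species)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = 0.2, position = position_dodge(0.9))
p
```

This should give the same kind of dodged plot, with one bar and one error bar per Species and variable.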

    or

    library(ggpubr)
    iris %>% 
      pivot_longer(-Species) %>% 
      ggbarplot(x = "name", y = "value", add = "mean_sd",
                color = "Species", position = position_dodge(0.9))
    

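
As for why the attempt in the question looks wrong: as far as I can tell, sd(val) inside aes() is evaluated over the whole column of plotted means, so every bar gets the same error bar length instead of its own group's SD. Here is a small sketch to check this outside ggplot2 (means_long, overall_sd and per_group_sd are illustrative names, and pivot_longer stands in for the gather call):

```r
library(dplyr)
library(tidyr)

# The data that reaches ggplot in the question: 12 group means in one column
means_long <- iris %>%
  group_by(Species) %>%
  summarise_if(is.numeric, list(mean = mean, sd = sd)) %>%
  select(Species, ends_with("mean")) %>%
  pivot_longer(-Species, names_to = "key", values_to = "val")

# What sd(val) amounts to here: a single number, the spread of the 12 means
overall_sd <- sd(means_long$val)
overall_sd

# What the error bars need: one SD per Species (and per variable),
# computed from the raw observations
per_group_sd <- iris %>%
  group_by(Species) %>%
  summarise(sd_sepal_length = sd(Sepal.Length))
per_group_sd
```

The SDs therefore have to be computed from the raw data and carried along as their own column before plotting.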