How do I use SE instead of SD for my bar chart error bars, and how do I change the order of my x-axis groups?


I have created a bar chart displaying the proportion of time spent on different behaviours for groups of lemurs. However, I am faced with two problems.

1) I had hoped to use standard error (SE) bars in place of my standard deviation (SD) bars, but I am unsure how to incorporate this into my existing code. My current code is as follows:

 data_summary <- function(data, varname, groupnames){
  require(plyr)
  summary_func <- function(x, col){
    c(mean = mean(x[[col]], na.rm=TRUE),
      sd = sd(x[[col]], na.rm=TRUE),)
  }
  data_sum<-ddply(data, groupnames, .fun=summary_func,
                  varname)
  data_sum <- rename(data_sum, c("mean" = varname))
  return(data_sum)
}

df4 <- data_summary(mydata_bc, varname="Time", 
                    groupnames=c("Group", "Behaviour"))

p <- ggplot(df4, aes(x=Behaviour, y=Time, fill=Group)) + 
  geom_bar(stat="identity", position=position_dodge()) +
  geom_errorbar(aes(ymin=Time-sd, ymax=Time+sd), width=.2,
                position=position_dodge(0.9))

2) I would also like to change the order of the behaviours on the x-axis.

Any help would be greatly appreciated.

[Image: current bar chart]

My csv data: https://drive.google.com/file/d/1UWJoluv3MWwXoQg2zcDORDJiWuIA8j4f/view?usp=sharing


Solution

  • I replaced:

    sd = sd(x[[col]], na.rm=TRUE)
    

    With:

    se = sd(x[[col]], na.rm=TRUE) / sqrt(sum(!is.na(x[[col]])))
    

    This is the SD divided by the square root of the number of non-missing observations n, i.e. SE = SD / sqrt(n).
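
    As a quick check of that formula, here is a minimal sketch on a made-up vector (the values and the names `x`, `n_obs`, `se` and `se_naive` are illustrative, not taken from the lemur data). It suggests why counting non-missing values with `sum(!is.na(x))` matters when the data contain `NA`:

    ```r
    # Hypothetical data: eight observations plus one missing value
    x <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)

    # Number of non-missing values (8 here)
    n_obs <- sum(!is.na(x))

    # Standard error: SD of the observed values over sqrt(n)
    se <- sd(x, na.rm = TRUE) / sqrt(n_obs)

    # length(x) also counts the NA, so this divides by sqrt(9)
    # and understates the standard error
    se_naive <- sd(x, na.rm = TRUE) / sqrt(length(x))

    print(c(se = se, se_naive = se_naive))
    ```

    If a column has no missing values the two versions agree, so the `sum(!is.na(...))` form is simply the safer default.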

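    There was also an extra trailing comma after the sd = ... line inside c() in your data_summary function. It leaves an empty argument, which makes R stop with an "argument 3 is empty" error, so it has to be removed.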
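
    The effect of a stray trailing comma inside `c()` can be reproduced in isolation. This is a small sketch with arbitrary values (the names `res` and `ok` are illustrative); `tryCatch()` is used only so that the script keeps running after the error:

    ```r
    # A trailing comma leaves an empty third argument, which c() rejects
    res <- tryCatch(
      c(mean = 1, sd = 2, ),
      error = function(e) conditionMessage(e)
    )
    print(res)  # the error message R reports for the empty argument

    # Without the trailing comma the named vector is built as expected
    ok <- c(mean = 1, sd = 2)
    print(ok)
    ```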

    You can change the order of the groups on the x-axis by setting the factor levels in the order you want.

    mydata_bc$Behaviour <- factor(mydata_bc$Behaviour, levels = c("Resting","Feeding","Socialising","Locomotion"))
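
    To see what the explicit `levels` argument changes, here is a minimal sketch on a made-up character vector (the name `behaviour` and its contents are illustrative). By default `factor()` sorts the levels alphabetically, and ggplot2 draws a discrete axis in level order:

    ```r
    # Hypothetical behaviour labels, in arbitrary order
    behaviour <- c("Feeding", "Resting", "Locomotion", "Socialising", "Resting")

    # Default: levels are sorted alphabetically
    default_f <- factor(behaviour)
    print(levels(default_f))

    # Explicit levels: the order given here is the order used on the axis
    ordered_f <- factor(behaviour,
                        levels = c("Resting", "Feeding", "Socialising", "Locomotion"))
    print(levels(ordered_f))
    ```

    One thing to watch for: any value that is not listed in `levels` becomes `NA`, so the spelling has to match the data exactly.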
    

    Then you can recompute the summary and plot:

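    library(ggplot2)  # ggplot(), geom_bar(), geom_errorbar()
    library(plyr)     # ddply() and rename()

    # Summarise varname by groupnames: mean and standard error per group
    data_summary <- function(data, varname, groupnames){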
      summary_func <- function(x, col){
        c(mean = mean(x[[col]], na.rm=TRUE),
          se = sd(x[[col]], na.rm=TRUE) / sqrt(sum(!is.na(x[[col]]))))
      }
      data_sum<-ddply(data, groupnames, .fun=summary_func,
                      varname)
      data_sum <- rename(data_sum, c("mean" = varname))
      return(data_sum)
    }
    
    mydata_bc$Behaviour <- factor(mydata_bc$Behaviour, levels = c("Resting","Feeding","Socialising","Locomotion"))
    
    df4 <- data_summary(mydata_bc, varname="Time", 
                        groupnames=c("Group", "Behaviour"))
    
    p <- ggplot(df4, aes(x=Behaviour, y=Time, fill=Group)) + 
      geom_bar(stat="identity", position=position_dodge()) +
      geom_errorbar(aes(ymin=Time-se, ymax=Time+se), width=.2,
                    position=position_dodge(0.9))
    
