
ggplot: Adding mean and error bars to a dot plot with different groups


I'm new to ggplot and am starting from this graph:

library(ggplot2)
library(reshape2)
library(scales)  # provides log2_trans(), trans_breaks(), trans_format(), math_format()

data <- read.delim(textConnection("
Sample Day_0 Day_1 Day_4 Day_5 Day_7
NM 1000 221000 6620000 17200000 43700000
OG 1000 351000 1750000 6880000 18300000
OD 1000 961000 1090000 6380000 4400000
ODD 1000 1060000 3550000 12000000 13100000"), sep = " ")

data_melt <- melt(data, id.var = "Sample")
data_melt$value <- as.numeric(data_melt$value)

ggplot(data = data_melt, aes(x = variable, y = value, color = Sample)) +
  geom_point(size = 2.5) +
  scale_y_continuous(trans = log2_trans(),
                     breaks = trans_breaks("log10", function(x) 10^x),
                     labels = trans_format("log10", math_format(10^.x))) +
  ggtitle("My_Title") + xlab("My_X") + ylab("My_Axis") +
  theme(plot.title = element_text(hjust = 0.5)) +
  expand_limits(y = c(10^3, 10^8))

see the graph result

What I would like to do is add the mean and error bars of the 4 points for each of the "Days" (for example, in the style of this picture from http://www.sthda.com/).

Any method or advice would be helpful!


Solution

  • You could do this using geom_errorbar and adding the relevant statistics to your data set. For the length of the error bars, the code below simply uses the 0.25 and 0.75 empirical quantiles of each day's values. If you want a different range, just change lower and upper to the statistics you are interested in.

    library(dplyr)

    # Attach the per-day quartiles and mean to every row of the long data
    data_melt <- data_melt %>%
      group_by(variable) %>%
      mutate(upper = quantile(value, 0.75),
             lower = quantile(value, 0.25),
             mean  = mean(value))
    
    # The first 9 rows of your data set should now look like this:
    # A tibble: 20 x 6
    # Groups: variable [5]
    #Sample variable    value    upper    lower     mean
    #<fctr> <fctr>      <dbl>    <dbl>    <dbl>    <dbl>
    #1 NM     Day_0        1000     1000     1000     1000
    #2 OG     Day_0        1000     1000     1000     1000
    #3 OD     Day_0        1000     1000     1000     1000
    #4 ODD    Day_0        1000     1000     1000     1000
    #5 NM     Day_1      221000   985750   318500   648250
    #6 OG     Day_1      351000   985750   318500   648250
    #7 OD     Day_1      961000   985750   318500   648250
    #8 ODD    Day_1     1060000   985750   318500   648250
    #9 NM     Day_4     6620000  4317500  1585000  3252500
    
    ggplot(data = data_melt, aes(x = variable, y = value, color = Sample)) +
      geom_point(size = 2.5) +
      scale_y_continuous(trans = log2_trans(),
                         breaks = trans_breaks("log10", function(x) 10^x),
                         labels = trans_format("log10", math_format(10^.x))) +
      ggtitle("My_Title") +
      xlab("My_X") + ylab("My_Axis") +
      theme(plot.title = element_text(hjust = 0.5)) +
      expand_limits(y = c(10^3, 10^8)) +
      # Quartile range of each day, drawn in red
      geom_errorbar(aes(ymin = lower, ymax = upper), col = "red", width = 0.25) +
      # Mean of each day, drawn as a larger red point
      geom_point(aes(x = variable, y = mean), size = 3, col = "red")
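
As a rough sketch of swapping in a different range: the block below uses the mean plus or minus one standard error of the mean, which is an assumed choice and not something the question asks for. It also keeps the statistics in a separate data frame with one row per day, so each bar is drawn once rather than once per sample. The names day_stats, avg and se are illustrative, the data is retyped from the question so the block runs on its own, and a plain scale_y_log10() stands in for the original log2 scale with log10 breaks.

```r
library(ggplot2)
library(dplyr)

# Long-format data retyped from the question so this block is self-contained
data_melt <- data.frame(
  Sample   = rep(c("NM", "OG", "OD", "ODD"), times = 5),
  variable = factor(rep(c("Day_0", "Day_1", "Day_4", "Day_5", "Day_7"), each = 4)),
  value    = c(1000, 1000, 1000, 1000,
               221000, 351000, 961000, 1060000,
               6620000, 1750000, 1090000, 3550000,
               17200000, 6880000, 6380000, 12000000,
               43700000, 18300000, 4400000, 13100000)
)

# One row per day: mean and standard error of the mean (assumed error measure)
day_stats <- data_melt %>%
  group_by(variable) %>%
  summarise(avg = mean(value),
            se  = sd(value) / sqrt(n()),
            .groups = "drop") %>%
  mutate(lower = avg - se,
         upper = avg + se)

p <- ggplot(data_melt, aes(x = variable, y = value)) +
  geom_point(aes(color = Sample), size = 2.5) +
  # The summary layers use their own data, so they must not inherit y = value
  geom_errorbar(data = day_stats,
                aes(x = variable, ymin = lower, ymax = upper),
                inherit.aes = FALSE, width = 0.25, color = "red") +
  geom_point(data = day_stats, aes(x = variable, y = avg),
             inherit.aes = FALSE, size = 3, color = "red") +
  scale_y_log10()
p
```

Any other pair of statistics (standard deviation, a confidence interval, min and max) can be dropped into the mutate() step in the same way, as long as lower stays positive on a log axis.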
    

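
Another option, sketched here under the same caveats, is to let stat_summary() compute the statistics when the plot is drawn instead of adding columns to the data. One thing to watch: ggplot2 applies the scale transformation before computing stats, so with a log10 axis the summary function receives log10 values. The hypothetical helper raw_scale_summary() converts back to the raw scale first, so the red point is still the arithmetic mean and the bar still spans the quartiles of the raw values.

```r
library(ggplot2)

# Long-format data retyped from the question so this block is self-contained
data_melt <- data.frame(
  Sample   = rep(c("NM", "OG", "OD", "ODD"), times = 5),
  variable = rep(c("Day_0", "Day_1", "Day_4", "Day_5", "Day_7"), each = 4),
  value    = c(1000, 1000, 1000, 1000,
               221000, 351000, 961000, 1060000,
               6620000, 1750000, 1090000, 3550000,
               17200000, 6880000, 6380000, 12000000,
               43700000, 18300000, 4400000, 13100000)
)

# Hypothetical helper: receives log10-transformed y values from stat_summary(),
# computes mean and quartiles on the raw scale, and returns them re-transformed
# in the y / ymin / ymax columns that fun.data is expected to provide
raw_scale_summary <- function(log_values) {
  raw <- 10^log_values
  data.frame(
    y    = log10(mean(raw)),
    ymin = log10(unname(quantile(raw, 0.25))),
    ymax = log10(unname(quantile(raw, 0.75)))
  )
}

p <- ggplot(data_melt, aes(x = variable, y = value)) +
  geom_point(aes(color = Sample), size = 2.5) +
  stat_summary(fun.data = raw_scale_summary, geom = "errorbar",
               width = 0.25, color = "red") +
  stat_summary(fun.data = raw_scale_summary, geom = "point",
               size = 3, color = "red") +
  scale_y_log10()
p
```

If a geometric mean is acceptable, the helper can skip the back-transformation and summarise the log values directly.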