
Line plot with error bars in which each line is a different group and multiple variables are on the x axis


I'm trying to create a line plot with error bars in R/RStudio, in which each line is a different group (coded by one variable) and several continuous variables make up the x axis. Taking the diamonds dataset as an example, it would be a multiple-line graph in which each line is one category of the variable "color", and x, y and z are the variables positioned along the x axis, with their values shown on the y axis. The head of diamonds, as printed in RStudio, is:

>head(diamonds)

  carat cut       color clarity depth table price     x     y     z
  <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31
4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75
6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48

An example of a similar graph is the one in the attached picture, but I need one with error bars. That graph was made in Stata, which can't add error bars with this command: profileplot varx vary varz, by(groups)

A profile plot without error bars, as an example, is here:


Solution

  • Before we start: we will plot the x, y and z columns from diamonds. Because x and y are very close, I subtract 1 from y so the two can be told apart, and I also introduce some error so there is something for the error bars to show.

    library(tidyr)
    library(ggplot2)
    library(dplyr)
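    # Reshape to long format: one row per diamond and variable (x, y or z)
    mydata <- diamonds %>% select(color,x,y,z) %>% pivot_longer(-color)

    # First rows of the long-format data:
    # A tibble: 6 x 3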
      color name  value
      <ord> <chr> <dbl>
    1 E     x      1.80
    2 E     y      3.98
    3 E     z      2.43
    4 E     x      2.92
    5 E     y      3.84
    6 E     z      2.31
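
    The code above shows only the reshaping step, not the adjustment described at the start. A minimal sketch of how that adjustment might be done follows. The name mydata_adj, the seed, and the noise level (sd = 0.5) are assumptions chosen for illustration, not values taken from the original answer:

    ```r
    library(dplyr)
    library(tidyr)
    library(ggplot2)  # provides the diamonds dataset

    set.seed(1)  # make the added noise reproducible

    # Assumption: shift y down by 1 and add Gaussian noise to every value.
    # Both the shift and the noise level are illustrative choices only.
    mydata_adj <- diamonds %>%
      select(color, x, y, z) %>%
      mutate(y = y - 1) %>%                      # separate y from x
      pivot_longer(-color) %>%                   # columns: color, name, value
      mutate(value = value + rnorm(n(), mean = 0, sd = 0.5))

    head(mydata_adj)
    ```

    To plot the adjusted values, mydata_adj would be passed to ggplot() in place of mydata.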
    

    Then:

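    # Group means as points and lines (one line per color), error bars = mean +/- 1 SE.
    # fun replaces fun.y, which is deprecated since ggplot2 3.3.0.
    ggplot(mydata,aes(x=name,y=value,color=color)) +
      stat_summary(fun=mean,geom="point") +
      stat_summary(fun=mean,aes(group=color),geom="line") +
      stat_summary(fun.data=mean_se,geom="errorbar",width=0.1)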
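
    To make explicit what stat_summary() computes here, the same plot can be built from a pre-computed summary table. This is a sketch, assuming the standard error is taken as sd / sqrt(n) within each color/variable group; the names summ, avg and se are illustrative:

    ```r
    library(dplyr)
    library(tidyr)
    library(ggplot2)

    # Same long-format data as above
    mydata <- diamonds %>% select(color, x, y, z) %>% pivot_longer(-color)

    # Mean and standard error for each color/variable combination
    summ <- mydata %>%
      group_by(color, name) %>%
      summarise(avg = mean(value),
                se  = sd(value) / sqrt(n()),
                .groups = "drop")

    # One line per color, with error bars of +/- 1 standard error
    p <- ggplot(summ, aes(x = name, y = avg, color = color, group = color)) +
      geom_point() +
      geom_line() +
      geom_errorbar(aes(ymin = avg - se, ymax = avg + se), width = 0.1)
    p
    ```

    Having the summary in its own data frame also makes it easy to inspect the numbers behind each error bar, or to swap in a different measure of spread.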
    


    In this case the error bars are not very informative, because the x, y and z values are so similar to one another.
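
    If the bars are too small or overlap too much to read, two adjustments may help. This is a sketch, not part of the original answer: mean_se() accepts a mult argument, so mult = 1.96 gives an approximate 95% interval under a normal approximation, and position_dodge() shifts the groups sideways so their bars do not sit on top of each other. The dodge width of 0.2 is an arbitrary choice:

    ```r
    library(dplyr)
    library(tidyr)
    library(ggplot2)

    mydata <- diamonds %>% select(color, x, y, z) %>% pivot_longer(-color)

    # Shift each color group slightly along the x axis
    dodge <- position_dodge(width = 0.2)

    p95 <- ggplot(mydata, aes(x = name, y = value, color = color, group = color)) +
      stat_summary(fun = mean, geom = "point", position = dodge) +
      stat_summary(fun = mean, geom = "line", position = dodge) +
      # mean +/- 1.96 * SE instead of the default mean +/- 1 * SE
      stat_summary(fun.data = mean_se, fun.args = list(mult = 1.96),
                   geom = "errorbar", width = 0.1, position = dodge)
    p95
    ```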