
How to make a lineplot with specific values out of a dataframe


I have a data frame as follows:

Variable        Value
G1_temp_0       37.9
G1_temp_5       37.95333333
G1_temp_10      37.98333333
G1_temp_15      38.18666667
G1_temp_20      38.30526316
G1_temp_25      38.33529412
G1_mean_Q1      38.03666667
G1_mean_Q2      38.08666667
G1_mean_Q3      38.01
G1_mean_Q4      38.2
G2_temp_0       37.9
G2_temp_5       37.95333333
G2_temp_10      37.98333333
G2_temp_15      38.18666667
G2_temp_20      38.30526316
G2_temp_25      38.33529412
G2_mean_Q1      38.53666667
G2_mean_Q2      38.68666667
G2_mean_Q3      38.61
G2_mean_Q4      38.71

I would like to make a line plot with two lines, one for the values "G1_mean_Q1" to "G1_mean_Q4" and one for "G2_mean_Q1" to "G2_mean_Q4".

In the end, the x-axis should represent the different variables.

My main problem is how to get a basic line plot from this data frame. I've tried something like this:

ggplot(df, aes(x = c(1:4), y = Value) + geom_line()

but I always get errors. It would be great if someone could help me. Thanks.


Solution

  • Please post your data with dput(data) next time. It makes it easier to read your data into R.

    You need to tell ggplot which rows belong to which group. You can do this with aes(group = Sample). To get a Sample column, restructure your data frame a bit: keep only the "mean" rows and separate Variable into several columns.

    library(tidyverse)
    dat <- structure(list(
      Variable = structure(
        c(5L, 10L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L,
          15L, 20L, 16L, 17L, 18L, 19L, 11L, 12L, 13L, 14L),
        .Label = c("G1_mean_Q1", "G1_mean_Q2", "G1_mean_Q3", "G1_mean_Q4",
                   "G1_temp_0", "G1_temp_10", "G1_temp_15", "G1_temp_20", "G1_temp_25",
                   "G1_temp_5", "G2_mean_Q1", "G2_mean_Q2", "G2_mean_Q3", "G2_mean_Q4",
                   "G2_temp_0", "G2_temp_10", "G2_temp_15", "G2_temp_20", "G2_temp_25",
                   "G2_temp_5"),
        class = "factor"),
      Value = c(37.9, 37.95333333, 37.98333333, 38.18666667, 38.30526316,
                38.33529412, 38.03666667, 38.08666667, 38.01, 38.2,
                37.9, 37.95333333, 37.98333333, 38.18666667, 38.30526316,
                38.33529412, 38.53666667, 38.68666667, 38.61, 38.71)
    ), class = "data.frame", row.names = c(NA, -20L))
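
    # Keep only the "mean" rows, then split e.g. "G1_mean_Q1" into
    # Sample = "G1", mean = "mean", time = "Q1"
    dat <- dat %>% 
      filter(str_detect(Variable, "mean")) %>% 
      separate(Variable, into = c("Sample", "mean", "time"), sep = "_")

    # One line per Sample, with the quarters on the x-axis
    g <- ggplot(data = dat, aes(x = time, y = Value, group = Sample)) +
      geom_line(aes(colour = Sample))
    g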
    

    Created on 2020-07-20 by the reprex package (v0.3.0)
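
If you would rather not load the tidyverse, the same restructuring can be sketched in base R. This is only an illustration under a few assumptions: the names df, parts and wide are hypothetical, only the eight "mean" rows from the question are rebuilt here, and matplot() is swapped in for ggplot, so it is not the method used in the answer above.

```r
# Rebuild only the "mean" rows (values copied from the question).
# The names df, parts and wide are illustrative, not from the original answer.
df <- data.frame(
  Variable = c(paste0("G1_mean_Q", 1:4), paste0("G2_mean_Q", 1:4)),
  Value = c(38.03666667, 38.08666667, 38.01, 38.2,
            38.53666667, 38.68666667, 38.61, 38.71),
  stringsAsFactors = FALSE
)

# Split "G1_mean_Q1" at the underscores into sample, statistic and quarter.
parts <- do.call(rbind, strsplit(df$Variable, "_", fixed = TRUE))
df$Sample <- parts[, 1]
df$time <- parts[, 3]

# One row per quarter and one column per sample. identity() is enough
# because every quarter/sample combination has exactly one value.
wide <- tapply(df$Value, list(df$time, df$Sample), identity)

# Draw one line per column, then label the x-axis with the quarters.
matplot(wide, type = "b", pch = 19, lty = 1, xaxt = "n",
        xlab = "time", ylab = "Value")
axis(1, at = seq_len(nrow(wide)), labels = rownames(wide))
legend("topleft", legend = colnames(wide),
       col = seq_len(ncol(wide)), lty = 1, pch = 19)
```

The df built here, with its Sample and time columns, should also work with the ggplot call from the answer, since that call only needs time, Value and Sample.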