Tags: r, ggplot2, plot, multiple-columns

Plotting specific columns of a data frame in R


ind set inst_0 inst_1 inst_2 Inst_3 inst_4 inst_5
0 1 20 30 50 55 58 60
0 2 34 44 46 67 89 70
0 3 37 89 78 80 90 98
0 4 23 45 67 89 87 89
1 1 34 56 65 78 77 89
1 2 23 32 45 55 66 77
1 3 35 69 88 99 98 57
1 4 23 45 56 78 89 99
2 1 23 34 55 55 77 88
2 2 12 44 55 67 88 90
2 3 12 66 77 91 44 99
2 4 45 55 88 31 56 100

I have a data frame like the one above, and I would like to plot the trend across inst_0 to inst_5, as in the graph below (which was made for only 4 individuals in the same set), for combinations such as Ind0-set1, Ind1-set1, Ind2-set2, ..., Ind0-set2, Ind1-set2. My second question is: how can I plot multiple line graphs, one for each set, within a single figure?

[Image: example graph]

I am not sure whether I should use ggplot2 or whether this can also be done with the base plot function.


Solution

  • If you want to do this using ggplot2, the first step is to reshape your data to long (tidy) format, e.g. with tidyr::pivot_longer:

    library(tidyr)
    library(dplyr)
    library(ggplot2)
    
    # Reshape to long
    dat <- dat %>%
      # Convert all column names to lower case
      rename_with(tolower) %>%
      pivot_longer(-c(ind, set), names_to = "inst", values_to = "value", names_prefix = "inst_")
    

    After doing so, you could create a plot showing all individuals for all sets by using faceting, which draws one panel per set:

    ggplot(dat, aes(inst, value, color = factor(ind), group = ind)) +
      geom_line() +
      geom_point() +
      facet_wrap(~set)
    

    Or you could filter your data down to the desired combinations, e.g. to create a plot for just one set:

    dat_filtered <- dat[dat$set == 1, ]
    
    ggplot(dat_filtered, aes(inst, value, color = factor(ind), group = ind)) +
      geom_line() +
      geom_point()
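
    To plot only particular ind/set combinations, as the question asks, one option is to list the wanted pairs in a small data frame and keep the matching rows with dplyr::semi_join. This is a minimal sketch and not part of the original answer: the pairs in wanted are a hypothetical selection, dat_long and dat_sel are illustrative names, and the example data is repeated so the snippet runs on its own.

    ```r
    library(tidyr)
    library(dplyr)
    library(ggplot2)

    # Example data from the question (wide format)
    dat <- data.frame(
      ind = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
      set = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
      inst_0 = c(20L, 34L, 37L, 23L, 34L, 23L, 35L, 23L, 23L, 12L, 12L, 45L),
      inst_1 = c(30L, 44L, 89L, 45L, 56L, 32L, 69L, 45L, 34L, 44L, 66L, 55L),
      inst_2 = c(50L, 46L, 78L, 67L, 65L, 45L, 88L, 56L, 55L, 55L, 77L, 88L),
      Inst_3 = c(55L, 67L, 80L, 89L, 78L, 55L, 99L, 78L, 55L, 67L, 91L, 31L),
      inst_4 = c(58L, 89L, 90L, 87L, 77L, 66L, 98L, 89L, 77L, 88L, 44L, 56L),
      inst_5 = c(60L, 70L, 98L, 89L, 89L, 77L, 57L, 99L, 88L, 90L, 99L, 100L)
    )

    # Reshape to long format, one row per (ind, set, inst)
    dat_long <- dat %>%
      rename_with(tolower) %>%
      pivot_longer(-c(ind, set), names_to = "inst", values_to = "value", names_prefix = "inst_")

    # Hypothetical selection of (ind, set) pairs to plot
    wanted <- data.frame(ind = c(0L, 1L, 2L), set = c(1L, 1L, 2L))

    # semi_join() keeps only the rows whose (ind, set) pair appears in wanted
    dat_sel <- semi_join(dat_long, wanted, by = c("ind", "set"))

    # One line per selected ind-set combination
    ggplot(dat_sel, aes(inst, value,
                        color = interaction(ind, set, sep = "-"),
                        group = interaction(ind, set))) +
      geom_line() +
      geom_point() +
      labs(color = "ind-set")
    ```

    Grouping by interaction(ind, set) matters here: with group = ind alone, the same individual from two different sets would be joined into a single line.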
    

    DATA

    dat <- data.frame(
      ind = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
      set = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
      inst_0 = c(20L, 34L, 37L, 23L, 34L, 23L, 35L, 23L, 23L, 12L, 12L, 45L),
      inst_1 = c(30L, 44L, 89L, 45L, 56L, 32L, 69L, 45L, 34L, 44L, 66L, 55L),
      inst_2 = c(50L, 46L, 78L, 67L, 65L, 45L, 88L, 56L, 55L, 55L, 77L, 88L),
      Inst_3 = c(55L, 67L, 80L, 89L, 78L, 55L, 99L, 78L, 55L, 67L, 91L, 31L),
      inst_4 = c(58L, 89L, 90L, 87L, 77L, 66L, 98L, 89L, 77L, 88L, 44L, 56L),
      inst_5 = c(60L, 70L, 98L, 89L, 89L, 77L, 57L, 99L, 88L, 90L, 99L, 100L)
    )
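
  • If you would rather stay with base graphics, something along the following lines should also work. This is a rough sketch and not part of the original answer: it uses matplot, which draws one line per matrix column, and par(mfrow = ...) to put one panel per set into a single figure. The names inst_cols, y_range, one_set and y are illustrative, and the 2 x 2 layout assumes the four sets of the example data.

    ```r
    # Example data from the question (wide format, no reshaping needed here)
    dat <- data.frame(
      ind = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
      set = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
      inst_0 = c(20L, 34L, 37L, 23L, 34L, 23L, 35L, 23L, 23L, 12L, 12L, 45L),
      inst_1 = c(30L, 44L, 89L, 45L, 56L, 32L, 69L, 45L, 34L, 44L, 66L, 55L),
      inst_2 = c(50L, 46L, 78L, 67L, 65L, 45L, 88L, 56L, 55L, 55L, 77L, 88L),
      Inst_3 = c(55L, 67L, 80L, 89L, 78L, 55L, 99L, 78L, 55L, 67L, 91L, 31L),
      inst_4 = c(58L, 89L, 90L, 87L, 77L, 66L, 98L, 89L, 77L, 88L, 44L, 56L),
      inst_5 = c(60L, 70L, 98L, 89L, 89L, 77L, 57L, 99L, 88L, 90L, 99L, 100L)
    )

    # Measurement columns, matched case-insensitively because of "Inst_3"
    inst_cols <- grep("^inst_", names(dat), ignore.case = TRUE, value = TRUE)

    # Common y-axis range so the panels are comparable
    y_range <- range(as.matrix(dat[, inst_cols]))

    # 2 x 2 grid of panels, one per set (assumes four sets)
    op <- par(mfrow = c(2, 2))
    for (s in sort(unique(dat$set))) {
      one_set <- dat[dat$set == s, ]
      # Transpose so that rows are inst_0..inst_5 and columns are individuals
      y <- t(as.matrix(one_set[, inst_cols]))
      matplot(x = seq_along(inst_cols) - 1, y = y, type = "b", pch = 16, lty = 1,
              col = seq_len(ncol(y)), ylim = y_range,
              xlab = "inst", ylab = "value", main = paste("set", s))
      legend("topleft", legend = paste("ind", one_set$ind),
             col = seq_len(ncol(y)), lty = 1, pch = 16, bty = "n")
    }
    par(op)  # restore the previous graphics settings
    ```

    To draw a single set only, drop the par() calls and the loop and set s to the set of interest.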