r ggplot2 tidyverse

geom_line group by year in R


I am trying to create a time series plot with one line per year, grouped by the year column. In the code below, when I try to gather values by year, the function instead uses another column's header (tmean) as the key, whereas I want the values in the year column to be the key.

How can I fix this?

Sample data

Date = c("2010-01", "2010-02", "2010-03","2010-04", "2011-01", "2011-02", "2011-03", "2011-04")
year = c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011)
month = c(01, 02, 03, 04, 01, 02, 03, 04)
tmean = c(15, 20, 30, 25, 18, 23, 33, 28)

Code

library(tidyverse)

df = data.frame(tmean, Date, year, month)

df = df %>% 
  select(Date, tmean, year) %>% 
  gather(key = "variable", value = "value", -Date)

ggplot(df, aes(x = Date, y = value)) + 
  geom_line(aes(color = variable, linetype = variable))

Desired Plot

[Image: desired plot, one line per year]


Solution

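  • You don't need to reshape your data frame to get the desired plot. Keep it in its original form, put month on the x-axis and tmean on the y-axis, and map year to color. Converting year to character makes ggplot2 treat it as a discrete variable, so each year gets its own line.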

    library(ggplot2)
    
    Date = c("2010-01", "2010-02", "2010-03","2010-04", "2011-01", "2011-02", "2011-03", "2011-04")
    year = c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011)
    month = c(01, 02, 03, 04, 01, 02, 03, 04)
    tmean = c(15, 20, 30, 25, 18, 23, 33, 28)
    df = data.frame(tmean, Date, year, month)
    
    ggplot(df, aes(month, tmean, color = as.character(year))) + 
      geom_line() +
      labs(color = "Year")
    

    Created on 2022-04-02 by the reprex package (v2.0.1)
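
To see why the original attempt produced tmean as a key, here is a minimal sketch, assuming the same sample data and that tidyr and ggplot2 are installed. gather() with -Date stacks every remaining column, so the key column ends up holding the column names rather than the year values. The names long and p, the use of factor(year), and the explicit group mapping are illustrative choices, not part of the original answer.

```r
library(tidyr)
library(ggplot2)

# Same sample data as in the question
df <- data.frame(
  Date  = c("2010-01", "2010-02", "2010-03", "2010-04",
            "2011-01", "2011-02", "2011-03", "2011-04"),
  year  = c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011),
  month = c(1, 2, 3, 4, 1, 2, 3, 4),
  tmean = c(15, 20, 30, 25, 18, 23, 33, 28)
)

# gather() stacks every column except Date, so the key column holds the
# column *names* ("tmean" and "year"), not the values of the year column.
long <- gather(df[, c("Date", "tmean", "year")],
               key = "variable", value = "value", -Date)
unique(long$variable)

# Leaving the data wide and mapping year to a discrete aesthetic draws
# one line per year. factor(year) plays the same role as as.character(year).
p <- ggplot(df, aes(x = month, y = tmean,
                    color = factor(year), group = year)) +
  geom_line() +
  labs(x = "Month", y = "tmean", color = "Year")
print(p)
```

The explicit group = year is redundant here because a discrete color mapping already splits the lines, but it keeps the grouping intact if the color mapping is later removed or changed.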