Search code examples
rdategroup-by

Row-wise summing for each month in a dataframe


I have the following (simplified) dataframe which contains a date column:

Date = c("2011-01-01", "2011-02-01", "2012-01-01", "2012-02-01", "2013-01-01", "2013-02-01")
Values = c(1.123389e+02, 7.975111e+01, 7.782333e+01, 4.643222e+01, 2.625444e+01, 8.565556e+00)
df <- data.frame(Date, Values)
df <- df %>% dplyr::mutate(Date = as.Date(as.character(Date), format = "%Y-%m-%d"));

which is:

Date,       Values
2011-01-01, 1.123389e+02
2011-02-01, 7.975111e+01
2012-01-01, 7.782333e+01
2012-02-01, 4.643222e+01
2013-01-01, 2.625444e+01
2013-02-01, 8.565556e+00

What I would like to do, is to average the values that correspond to the same month to get the following:

Date, Sum
Jan, 216.4167
Feb, 134.7489

Solution

  • After coercing to a real date with as.Date, like in the question, keep only the month with format. Then aggregate the data.

    suppressPackageStartupMessages({
      library(dplyr)
    })
    
    df %>% 
      mutate(Date = as.Date(as.character(Date), format = "%Y-%m-%d"),
             Date = format(Date, "%b")) %>%
      summarise(Sum = sum(Values), .by = Date)
    #>   Date      Sum
    #> 1  jan 216.4167
    #> 2  fev 134.7489
    

    Created on 2023-11-18 with reprex v2.0.2