Search code examples
rdata.tableweighted-average

Weighted average each year - R


I am looking to calculate the average of the interest_rate for each year (2006 to 2023). I have a series of interest rate decision dates for the new rate and the date.

For years with multiple interest_rates during a year I would like to weight the average value by the proportion of year that at that particular rate.

Example data:

library(data.table)
df = data.table(date = c("03/08/2006", "09/11/2006", "11/01/2007", "10/05/2007", "05/07/2007", "06/12/2007", "07/02/2008", "10/04/2008", "08/10/2008", "06/11/2008", "04/12/2008", "08/01/2009", "05/02/2009", "05/03/2009", "04/08/2016", "02/11/2017", "02/08/2018", "11/03/2020", "19/03/2020", "16/12/2021", "03/02/2022", "17/03/2022", "05/05/2022", "16/06/2022", "04/08/2022", "22/09/2022", "03/11/2022", "15/12/2022", "02/02/2023", "23/03/2023", "11/05/2023"),
                interest_rate = c(4.7500, 5.0000, 5.2500, 5.5000, 5.7500, 5.5000, 5.2500, 5.0000, 4.5000, 3.0000, 2.0000, 1.5000, 1.0000, 0.5000, 0.2500, 0.5000, 0.7500, 0.2500, 0.1000, 0.2500, 0.5000, 0.7500, 1.0000, 1.2500, 1.7500, 2.2500, 3.0000, 3.5000, 4.0000, 4.2500, 4.500))

The resultant data would look like:

year weighted_interest_rate
2006 4.7
2007 ...
2008 etc

Solution

  • One method would be to create a data frame of the interest rates for each day in the entire series, group by year, then average the interest rate for each date in the year.

    An important first step is converting the dates-as-strings to actual dates.

    library(tidyverse)
    
    df %>%
      mutate(date = as.Date(date, '%d/%m/%Y'),
             days = c(as.numeric(diff(date)), 0)) %>%
      rowwise() %>%
      reframe(date = seq(date, by = 'day', length.out = days),
              interest_rate = interest_rate,
              year = lubridate::year(date)) %>%
      group_by(year) %>%
      summarize(interest_rate = mean(interest_rate))
    #> # A tibble: 18 x 2
    #>     year interest_rate
    #>    <dbl>         <dbl>
    #>  1  2006         4.84 
    #>  2  2007         5.51 
    #>  3  2008         4.67 
    #>  4  2009         0.644
    #>  5  2010         0.5  
    #>  6  2011         0.5  
    #>  7  2012         0.5  
    #>  8  2013         0.5  
    #>  9  2014         0.5  
    #> 10  2015         0.5  
    #> 11  2016         0.398
    #> 12  2017         0.291
    #> 13  2018         0.604
    #> 14  2019         0.75 
    #> 15  2020         0.228
    #> 16  2021         0.107
    #> 17  2022         1.47 
    #> 18  2023         3.97