Search code examples
rlubridate

With a date sequence, divide a number over the count of days in a quarter


A data frame and some variables:

library(tidyverse)
library(lubridate)

budget_2020_q4 <- 1000000
budget_2021_q1 <- 2000000
budget_2021_q2 <- 3000000
budget_2021_q3 <- 3000000
budget_2021_q4 <- 2000000


calendar <- data.frame(
  cohort = seq('2020-10-01' %>% ymd, '2021-12-31' %>% ymd, by = '1 days')) %>% 
  mutate(Quarter = quarter(cohort, with_year = T))

I now have a data frame showing dates and the quarter that those dates are in:

calendar %>% head
      cohort Quarter
1 2020-10-01  2020.4
2 2020-10-02  2020.4
3 2020-10-03  2020.4
4 2020-10-04  2020.4
5 2020-10-05  2020.4
6 2020-10-06  2020.4

I also know the frequency of each quarter:

calendar$Quarter %>% table
.
2020.4 2021.1 2021.2 2021.3 2021.4 
    92     90     91     92     92

I would like to mutate a new column 'daily_budget' that takes the budget for the quarter and divides it over the frequency of dates in that quarter.

Example, the budget for q4 2020 is 1000000 and there are 92 days in Q4 so the daily budget is 1000000/92 = 10869.57

Can I somehow integrate this calculation into my dplyr pipeline of operations after mutate(Quarter = quarter(cohort, with_year = T))?


Solution

  • First, let's put the budgets in a data frame:

    budgets <- c(budget_2020_q4 = 1000000,
      budget_2021_q1 = 2000000,
      budget_2021_q2 = 3000000,
      budget_2021_q3 = 3000000,
      budget_2021_q4 = 2000000) %>% 
      enframe(name = "Quarter", value = "budget") %>% 
      mutate(Quarter = as.numeric(str_replace(str_remove(Quarter, "budget_"), "_q", ".")))
    

    Then, it is a matter of counting (tidyverse's alternative to table) the number of rows per Quarter, joining the budget and dividing the two:

    calendar %>% 
      add_count(Quarter) %>% 
      left_join(budgets, by = "Quarter") %>% 
      mutate(budget_by_day = budget / n)
    

    Which gives

            cohort Quarter  n budget budget_by_day
    1   2020-10-01  2020.4 92  1e+06      10869.57
    2   2020-10-02  2020.4 92  1e+06      10869.57
    3   2020-10-03  2020.4 92  1e+06      10869.57
    4   2020-10-04  2020.4 92  1e+06      10869.57
    5   2020-10-05  2020.4 92  1e+06      10869.57
    6   2020-10-06  2020.4 92  1e+06      10869.57
    7   2020-10-07  2020.4 92  1e+06      10869.57
    8   2020-10-08  2020.4 92  1e+06      10869.57
    9   2020-10-09  2020.4 92  1e+06      10869.57
    10  2020-10-10  2020.4 92  1e+06      10869.57
    ...