Search code examples
rlubridatedata-wranglingmutate

Select quarterly months values


From the given dataframe i am trying to extract quarterly_values however it fails when i try to mutatue every third month values .

mont_dates <- c("201401", "201402", "201403", "201404", "201405", 
"201406", "201407", "201408", "201409", "201410", "201411", "201412")
cat <- c("new", "old", "undefined", "new", "old", "undefined", 
"undefined", "new", "old", "undefined", "new", "old")
mont_vals <- c(221, 433, 878, 455, 998, 797, 77, 3435, 868, 686, 868, 434)
ID <- c(1,2,3,4,5,6,7,8,9,10,11,12)
df <- data.frame(mont_dates, mont_vals, cat, ID)

Function to extract monthly values and calculate quarterly values

 
Monthly_Qrt_vals <- function(df1) {
  df1 %>%
 mutate(mont_dates = ymd(paste0(mont_dates, "01")),
           quarter = paste0(year(mont_dates), " Q", quarter(mont_dates))) %>%
    arrange(mont_dates) %>%
    mutate(quarterly_values = mont_vals[c(1,seq(3, nrow(mont_vals) , by=3)),])
}
result <- df %>% Monthly_Qrt_vals()
View(result)
         cat ID mont_dates mont_vals quarter
1        new  1 2014-01-01       221 2014 Q1
2        old  2 2014-02-01       433 2014 Q1
3  undefined  3 2014-03-01       878 2014 Q1
4        new  4 2014-04-01       455 2014 Q2
5        old  5 2014-05-01       998 2014 Q2
6  undefined  6 2014-06-01       797 2014 Q2
7  undefined  7 2014-07-01        77 2014 Q3
8        new  8 2014-08-01      3435 2014 Q3
9        old  9 2014-09-01       868 2014 Q3
10 undefined 10 2014-10-01       686 2014 Q4
11       new 11 2014-11-01       868 2014 Q4
12       old 12 2014-12-01       434 2014 Q4

Error

Error in `mutate()`:
ℹ In argument: `quarterly_values = mont_vals[c(1, seq(3, nrow(mont_vals), by = 3)), ]`.
Caused by error in `seq.default()`:
! 'to' must be of length 1

Expected Outcome
quarterly_values should take values of every third month assign to first row of quarters i.e. 2014 Q1, 2014 Q2.

The original dataset is of 10 years from 2014-2023 Is there an alternative solution to get quarterly_values ?

         cat ID mont_dates mont_vals quarter  quarterly_values 
1        new  1 2014-01-01       221 2014 Q1   878
2        old  2 2014-02-01       433 2014 Q1   NA
3  undefined  3 2014-03-01       878 2014 Q1   NA
4        new  4 2014-04-01       455 2014 Q2   797
5        old  5 2014-05-01       998 2014 Q2   NA
6  undefined  6 2014-06-01       797 2014 Q2   NA
7  undefined  7 2014-07-01        77 2014 Q3   77
8        new  8 2014-08-01      3435 2014 Q3   NA
9        old  9 2014-09-01       868 2014 Q3   NA
10 undefined 10 2014-10-01       686 2014 Q4   686
11       new 11 2014-11-01       868 2014 Q4   NA
12       old 12 2014-12-01       434 2014 Q4   NA

Solution

  • This groups by quarter, then generates quarterly_values by setting the first row per quarter equal to the final month's value in that quarter. It uses ifelse to decide whether to fill the row, and tail to choose the final value within each group (quarter).

    Note the use of tail assumes the entries are in date order. You could sort them first if necessary.

    df |> 
       mutate(mont_dates = ymd(paste0(mont_dates, "01")),
              quarter = paste0(year(mont_dates), " Q", quarter(mont_dates))) |>
       group_by(quarter) |>
       mutate(quarterly_values=ifelse(row_number()==1 , tail(mont_vals,1) , NA))
    
    
    # A tibble: 12 × 6
    # Groups:   quarter [4]
       mont_dates mont_vals cat          ID quarter quarterly_values
       <date>         <dbl> <chr>     <dbl> <chr>              <dbl>
     1 2014-01-01       221 new           1 2014 Q1              878
     2 2014-02-01       433 old           2 2014 Q1               NA
     3 2014-03-01       878 undefined     3 2014 Q1               NA
     4 2014-04-01       455 new           4 2014 Q2              797
     5 2014-05-01       998 old           5 2014 Q2               NA
     6 2014-06-01       797 undefined     6 2014 Q2               NA
     7 2014-07-01        77 undefined     7 2014 Q3              868
     8 2014-08-01      3435 new           8 2014 Q3               NA
     9 2014-09-01       868 old           9 2014 Q3               NA
    10 2014-10-01       686 undefined    10 2014 Q4              434
    11 2014-11-01       868 new          11 2014 Q4               NA
    12 2014-12-01       434 old          12 2014 Q4               NA