Search code examples
rlubridatedata-wranglingmutate

R Select() Can't subset columns that don't exist


From the given dataframe i am trying to get Cat,monthly_dates, monthly_val and then calculating quarter, quarterly_months, quarterly_values

mont_dates <- c("201401", "201402", "201403", "201404", "201405", 
"201406", "201407", "201408", "201409", "201410", "201411", "201412")
cat <- c("new", "old", "undefined", "new", "old", "undefined", "undefined", "new", "old", "undefined", "new", "old")
mont_vals <- c(221, 433, 878, 455, 998, 797, 77, 3435, 868, 686, 868, 434)
ID <- c(1,2,3,4,5,6,7,8,9,10,11,12)
df <- data.frame(mont_dates, mont_vals, cat, ID)
Monthly_Qrt_vals <- function(df1) {
  df1 %>%
    mutate(mont_dates = ymd(paste0(mont_dates, "01"))) %>%
    pivot_longer(cols = -c(ID, cat, mont_dates),
                 names_to = "monthly_dates",
                 values_to = "monthly_val") %>%
    mutate(quarter = paste0(year(mont_dates), " Q", quarter(mont_dates))) %>%
    arrange(mont_dates) %>%
    group_by(quarter) %>%
    summarise(quarterly_values = sum(monthly_val, na.rm = TRUE),
              quarterly_months = n()) %>%
    ungroup() %>%
    select(monthly_dates, monthly_val, quarter, quarterly_months, quarterly_values)
}

result <- df %>% Monthly_Qrt_vals()
View(result)

Error In the pivot_longer i am getting the error monthly_dates doesn't exist. What am i doing wrong in this code ?

Error in `select()`:
! Can't subset columns that don't exist.
✖ Column `monthly_dates` doesn't exist.

Expected Outcome

Produce Columns from dataframe

Cat monthly_dates monthly_val quarter quarterly_months quarterly_values


Solution

  • You can keep the monthly dates and values by creating the quarterly summaries with mutate.

    library(tidyverse)
    
    Monthly_Qrt_vals <- function(df1) {
      df1 %>%
        mutate(mont_dates = ymd(paste0(mont_dates, "01")),
               quarter = paste0(year(mont_dates), " Q", quarter(mont_dates))) %>%
        arrange(mont_dates) %>%
        mutate(quarterly_values = sum(mont_vals, na.rm = TRUE),
               quarterly_months = n(), .by = quarter) %>%
        select(cat, mont_dates, mont_vals, quarter, quarterly_months, quarterly_values)
    }
    
    df %>% Monthly_Qrt_vals()
    #>          cat mont_dates mont_vals quarter quarterly_months quarterly_values
    #> 1        new 2014-01-01       221 2014 Q1                3             1532
    #> 2        old 2014-02-01       433 2014 Q1                3             1532
    #> 3  undefined 2014-03-01       878 2014 Q1                3             1532
    #> 4        new 2014-04-01       455 2014 Q2                3             2250
    #> 5        old 2014-05-01       998 2014 Q2                3             2250
    #> 6  undefined 2014-06-01       797 2014 Q2                3             2250
    #> 7  undefined 2014-07-01        77 2014 Q3                3             4380
    #> 8        new 2014-08-01      3435 2014 Q3                3             4380
    #> 9        old 2014-09-01       868 2014 Q3                3             4380
    #> 10 undefined 2014-10-01       686 2014 Q4                3             1988
    #> 11       new 2014-11-01       868 2014 Q4                3             1988
    #> 12       old 2014-12-01       434 2014 Q4                3             1988