Using filter inside R function


I have cost datasets for a device's components, some by Year and some by Quarter-Year, because some components change price quarterly while others change yearly. Since I need to calculate the device's total bill of materials by Quarter-Year, I need to convert every Year dataset to Quarter-Year.

Here is an example of a Year dataset. The data is in wide format.

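library(dplyr)    # %>%, slice(), mutate(), filter(), n()
library(tidyr)    # pivot_longer()
library(stringr)  # str_extract()

df <- data.frame(Item = c("A", "B", "C"),
                 Year2022 = c(2, 3, 8),
                 Year2023 = c(2, 4, 7.8),
                 Year2024 = c(3, 4, 7))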

Usually this is what I do:

  1. Define the wanted QY levels
  2. Convert the Year data to long form
  3. Repeat each row 4 times (once per quarter)
  4. Build the QY column by prefixing the quarter number (1Q22, 2Q22, etc.)
  5. Filter the dataset to keep only the wanted QY

# Define the QY levels I want
QY_level <-  c("1Q22", "2Q22", "3Q22", "4Q22",
               "1Q23", "2Q23", "3Q23", "4Q23",
               "1Q24")

df_nofunc <- df %>%
  pivot_longer(2:4,
               names_to = "QY",
               values_to = "Values") %>%
  slice(rep(1:n(), each = 4L)) %>%
  mutate(QY = paste0(1:4, "Q", str_extract(QY, "[0-9]{2}$"))) %>%
  filter(QY %in% QY_level)

That works fine, but I have learned that using a function is better, especially since I keep copying and pasting this same code across my project. I tried to turn this into a function, but the filter did not work.

I know there are similar questions, but I don't quite understand WHY it did not work. I only understand that I need to convert the QY column to a symbol using sym() and then unquote it inside filter() using !! (bang-bang). But it still did not work.

generate_QY <- function(x) {
  x %>%
    pivot_longer(2:4,
                 names_to = "QY",
                 values_to = "Values") %>%
    slice(rep(1:n(), each = 4L)) %>%
    mutate(QY = paste0(1:4, "Q", str_extract(QY, "[0-9]{2}$"))) %>% 
    filter(!!sym(QY) %in% QY_level)
  
  return(x)
}


df_withfunc <- generate_QY(df)

It still says object 'QY' not found. Why doesn't it work, and how do I make it work?


Solution

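  • You don’t need to change the pipeline from your working version at all, beyond removing the assignment: QY is a column of the data, so filter(QY %in% QY_level) works unchanged inside a function. !!sym(QY) fails because !! forces sym(QY) to be evaluated immediately, and at that point R looks for an ordinary variable named QY, which does not exist (hence "object 'QY' not found"). Also note that the trailing return(x) in your version returns the unmodified input; the result of the pipeline should be the function's value. The following works: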

    generate_QY <- function (df) {
      df %>%
        pivot_longer(
          2 : 4,
          names_to = "QY",
          values_to = "Values"
        ) %>%
        slice(rep(seq_len(n()), each = 4L)) %>%
        mutate(QY = paste0(1 : 4, "Q", str_extract(QY, "[0-9]{2}$"))) %>%
        filter(QY %in% QY_level)
    }
    

    (Note that I did change something unrelated: 1 : x is error-prone, because when x is 0 it yields c(1, 0) rather than an empty sequence, so seq_len(x) should be used instead.)
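
    As a follow-up, here is a minimal sketch of how the function could avoid depending on a global QY_level, and of the one situation where a column really does need special handling: when its name arrives as a string. The names generate_QY2 and filter_quarters, and the arguments qy_level, col and keep, are hypothetical and not part of the original code. Selecting the year columns with starts_with("Year") instead of 2 : 4 is also an assumption about how the Year datasets are laid out.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical variant: the wanted quarters are an argument instead of a
# global variable. QY is still referred to by its bare name because it is
# a column created inside the pipeline.
generate_QY2 <- function(df, qy_level) {
  df %>%
    pivot_longer(
      starts_with("Year"),   # assumption: year columns are named Year<yyyy>
      names_to = "QY",
      values_to = "Values"
    ) %>%
    slice(rep(seq_len(n()), each = 4L)) %>%
    mutate(QY = paste0(1:4, "Q", str_extract(QY, "[0-9]{2}$"))) %>%
    filter(QY %in% qy_level)
}

# Hypothetical helper: here the column name is passed as a string, so it is
# looked up in the data with the .data pronoun.
filter_quarters <- function(data, col, keep) {
  data %>%
    filter(.data[[col]] %in% keep)
}

df <- data.frame(Item = c("A", "B", "C"),
                 Year2022 = c(2, 3, 8),
                 Year2023 = c(2, 4, 7.8),
                 Year2024 = c(3, 4, 7))

QY_level <- c("1Q22", "2Q22", "3Q22", "4Q22",
              "1Q23", "2Q23", "3Q23", "4Q23",
              "1Q24")

df_withfunc <- generate_QY2(df, QY_level)
df_1q24 <- filter_quarters(df_withfunc, "QY", "1Q24")
```

    With the example data, df_withfunc should contain the nine wanted quarters for each of the three items, and df_1q24 only the 1Q24 rows. The string case is where something like !!sym(col) would be meaningful, because col then holds the column name as text; .data[[col]] is used here as the more readable equivalent.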