Search code examples
rdplyrcasemutate

Using a custom case_when function in dplyr mutate


I've review many posts related to my question, but I cannot seem to figure out my issue.

I have a basic table that will have additional columns as collection (the NFL season) continues. I cannot get my function using a case_when based on the week to run without returning "! object 'WK1' not found"

My data after the second week.

NumCorrect <- data.frame(
  TEAM = c(stringr::str_c("team_", 1:5)),
  WK1 = c(11,12,11,12,13),
  WK2 = c(7,7,9,10,7)
)

My code.

# Defined variable at the top of the code that I change as the weeks are played.
curWEEK = 2

# Simplified version of my dplyr code at the step that won't work. 
correct_d <- mutate(AVG_16 = average16(curWEEK))

My function.

average16 <- function(x) {case_when(x == 1 ~ WK1,
                                    x == 2 ~ round(mean((WK1:WK2), na.rm=TRUE),1),
                                    x == 3 ~ round(mean((WK1:WK3), na.rm=TRUE),1),
                                    x %in% 4:7 ~ round(mean((WK1:WK4), na.rm=TRUE),1),
                                    x %in% 8:11 ~ round(mean(c(WK1:WK4,WK8), na.rm=TRUE),1),
                                    x %in% 12:14 ~ round(mean(c(WK1:WK4,WK8,WK12), na.rm=TRUE),1),
                                    x == 15 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15), na.rm=TRUE),1),
                                    x == 16 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK16), na.rm=TRUE),1),
                                    x == 17 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK17), na.rm=TRUE),1),
                                    x == 18 ~ round(mean(c(WK1:WK4,WK8,WK12,WK15:WK18), na.rm=TRUE),1)
                                    )}

I tried making R and chatGPT fight it out but the function just got more and more convoluted without a solution.

How can I use a function to find the average of only certain columns in a table that isn't complete and will have columns added to it?

I have tried many versions and am able to modify the code to leave all week columns (WK1 to WK18) with unplayed weeks as NA, but still received the 'Object not found' error.

Error everytime:

Error in `mutate()`:
ℹ In argument: `AVG_16
  = average16(curWEEK)`.
ℹ In row 1.
Caused by error in `case_when()`:
! Failed to evaluate
  the right-hand side of
  formula 1.
Caused by error:
! object 'WK1' not found

I also attempted this:

average16 <- function(x) {if(x == 1) {WK1}
                          if(x == 2) {round(mean((WK1:WK2), na.rm=TRUE),1)}
                          if(x == 3) {round(mean((WK1:WK3), na.rm=TRUE),1)}
                          if(x %in% 4:7)   {round(mean((WK1:WK4), na.rm=TRUE),1)}
                          if(x %in% 8:11)  {round(mean(c(WK1:WK4,WK8), na.rm=TRUE),1)}
                          if(x %in% 12:14) {round(mean(c(WK1:WK4,WK8,WK12), na.rm=TRUE),1)}
                          if(x == 15) {round(mean(c(WK1:WK4,WK8,WK12,WK15), na.rm=TRUE),1)}
                          if(x == 16) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK16), na.rm=TRUE),1)}
                          if(x == 17) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK17), na.rm=TRUE),1)}
                          if(x == 18) {round(mean(c(WK1:WK4,WK8,WK12,WK15:WK18), na.rm=TRUE),1)}
                          }

Solution

  • If you're using dplyr, then maybe a very different method might achieve your goal.

    You can use a character vector to specify the columns that you want to take the mean of, if they exist. If they don't exist, then any_of will silently ignore them.

    library(tidyverse)
    NumCorrect <- data.frame(
        TEAM = LETTERS[1:5],
        WK1 = c(11,12,11,12,13),
        WK2 = c(7,7,9,10,7)
    )
    
    cols_to_avg <- paste0("WK", c(1:4, 8, 12, 15:18))
    
    NumCorrect |> 
        rowwise() |> 
        mutate(avg16 = round(mean(c_across(any_of(cols_to_avg))),1)) |> 
        ungroup()