I would like to create several (30+) variables using the function below, where N, B, C, and Q are each a list of variables in the df dataframe. I've included the code and the error I get from R. An example of the data follows, where "..." stands for the remaining columns, which contain 0, 1, or missing values.
record_id timeframe ...
95 2 0 0
94 2 1 NA
19 6 0 1
17 6 1 NA
18 6 2 NA
75 9 0 1
73 9 1 0
74 9 2 NA
lag_vars <- function(df, N, B, C, Q){
  df <- df %>% group_by(record_id) %>%
    mutate(N = case_when(
      timeframe == 0 ~ B,
      timeframe > 0 & C == 1 ~ Q,
      timeframe > 0 & C == 0 ~ lag(N)
    ))
  return(df)
}
lag_vars(t, Nt, Bt, Ct, Qt)
Which returns an error:
Error in `mutate()`:
! Problem while computing `N = case_when(...)`.
ℹ The error occurred in group 1: record_id = 2.
Caused by error in `case_when()`:
! `timeframe == 0 ~ B`, `timeframe > 0 & C == 1 ~ Q`, `timeframe > 0 & C == 0 ~ lag(N)` must be length 2 or one, not 3.
Run `rlang::last_trace()` to see where the error occurred.
Called from: signal_abort(cnd, .file)
Warning messages:
1: Problem while computing `N = case_when(...)`.
ℹ longer object length is not a multiple of shorter object length
ℹ The warning occurred in group 1: record_id = 2.
2: Problem while computing `N = case_when(...)`.
ℹ longer object length is not a multiple of shorter object length
ℹ The warning occurred in group 1: record_id = 2.
Is case_when able to work with vectors? Or can I place case_when inside another function?
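I have changed the function to avoid hard-coding record_id and timeframe. I also added an extra argument, Nt2, which names a new output column so you can compare the result against the original column Nt. Basically, you need to use !!enquo(colname) or {{colname}} to pass column names as arguments to the function. Without that, the names N, B, C, and Q inside mutate() are not resolved to the columns you passed in.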
library(dplyr)
library(rlang)
lag_vars <- function(df, id, time, N, B, C, Q, Nt2){
  out <- df %>% group_by(!! enquo(id)) %>%
    mutate(!! enquo(Nt2) := case_when(
      !! enquo(time) == 0 ~ !! enquo(B),
      !! enquo(time) > 0 & !! enquo(C) == 1 ~ !! enquo(Q),
      !! enquo(time) > 0 & !! enquo(C) == 0 ~ lag(!! enquo(N))
    ))
  return(out)
}
lag_vars(df1, record_id, timeframe, Nt, Bt, Ct, Qt, Ntest)
#> # A tibble: 8 × 7
#> # Groups:   record_id [3]
#>   record_id timeframe    Nt    Bt    Ct    Qt Ntest
#>       <int>     <int> <int> <int> <int> <int> <int>
#> 1         2         0     0     1     0     0     1
#> 2         2         1    NA     0    NA    NA    NA
#> 3         6         0     1    NA     1     1    NA
#> 4         6         1    NA    NA     0    NA     1
#> 5         6         2    NA    NA     1    NA    NA
#> 6         9         0     1     0    NA     1     0
#> 7         9         1     0     1     1     0     0
#> 8         9         2    NA    NA    NA    NA    NA
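For comparison, here is a minimal sketch of the same function written with the {{ }} shorthand instead of !!enquo(). The names lag_vars2 and new_col are made up for this illustration, and the data is the same df1 used in this answer. It assumes a dplyr version recent enough to support {{ }} (1.0 or later should do).

```r
library(dplyr)

# Same example data as df1 in this answer
df1 <- read.table(text = "record_id timeframe Nt Bt Ct Qt
2 0 0 1 0 0
2 1 NA 0 NA NA
6 0 1 NA 1 1
6 1 NA NA 0 NA
6 2 NA NA 1 NA
9 0 1 0 NA 1
9 1 0 1 1 0
9 2 NA NA NA NA", header = TRUE)

# lag_vars2 and new_col are hypothetical names for this sketch
lag_vars2 <- function(df, id, time, N, B, C, Q, new_col) {
  df %>%
    group_by({{ id }}) %>%
    # {{ new_col }} on the left of := names the output column
    mutate({{ new_col }} := case_when(
      {{ time }} == 0 ~ {{ B }},
      {{ time }} > 0 & {{ C }} == 1 ~ {{ Q }},
      {{ time }} > 0 & {{ C }} == 0 ~ lag({{ N }})
    )) %>%
    ungroup()
}

res <- lag_vars2(df1, record_id, timeframe, Nt, Bt, Ct, Qt, Ntest)
```

The Ntest column in res should match the output shown above; the only behavioural difference is that this sketch ungroups the result before returning it.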
read.table(text = "record_id timeframe Nt Bt Ct Qt
2 0 0 1 0 0
2 1 NA 0 NA NA
6 0 1 NA 1 1
6 1 NA NA 0 NA
6 2 NA NA 1 NA
9 0 1 0 NA 1
9 1 0 1 1 0
9 2 NA NA NA NA",
header = T, stringsAsFactors = F) -> df1
Created on 2024-04-02 with reprex v2.0.2
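Since the question mentions 30+ variables, one possible way to scale this up is to pass column names as strings and loop over a table of name sets. This is only a sketch under assumptions: lag_vars_chr, specs, and the second set of columns (Nu, Bu, Cu, Qu, copied from the first set) are invented for the illustration and are not part of the original data.

```r
library(dplyr)

df1 <- read.table(text = "record_id timeframe Nt Bt Ct Qt
2 0 0 1 0 0
2 1 NA 0 NA NA
6 0 1 NA 1 1
6 1 NA NA 0 NA
6 2 NA NA 1 NA
9 0 1 0 NA 1
9 1 0 1 1 0
9 2 NA NA NA NA", header = TRUE)

# Hypothetical second set of columns, copied only so there is something to loop over
df2 <- df1 %>% mutate(Nu = Nt, Bu = Bt, Cu = Ct, Qu = Qt)

# String-based variant: every column argument is a character scalar
lag_vars_chr <- function(df, id, time, N, B, C, Q, new_col) {
  df %>%
    group_by(across(all_of(id))) %>%
    mutate(!!new_col := case_when(
      .data[[time]] == 0 ~ .data[[B]],
      .data[[time]] > 0 & .data[[C]] == 1 ~ .data[[Q]],
      .data[[time]] > 0 & .data[[C]] == 0 ~ lag(.data[[N]])
    )) %>%
    ungroup()
}

# One row per variable set (assumed layout)
specs <- data.frame(
  N = c("Nt", "Nu"), B = c("Bt", "Bu"),
  C = c("Ct", "Cu"), Q = c("Qt", "Qu"),
  out = c("Nt_new", "Nu_new"),
  stringsAsFactors = FALSE
)

res <- df2
for (i in seq_len(nrow(specs))) {
  res <- lag_vars_chr(res, "record_id", "timeframe",
                      specs$N[i], specs$B[i], specs$C[i], specs$Q[i],
                      specs$out[i])
}
```

Each pass adds one new column, so with 30+ sets you would only need to extend specs. Writing to new columns rather than overwriting N keeps the original values available for checking.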