I have two data frames (df1 and df2) as you can see in the tables below. Based on the conditions in df1, I need to check the values in df2. The expected output is also presented.
The condition is based on the columns dep and value in df1, checked in df2 against the variables taken from the column var in df1. Please see the following examples:
For instance, based on row 1 in df1, we judge whether the values in column A in df2 are TRUE or FALSE.
if E == 1, A == TRUE
if E != 1, we need to check the following conditions:
- if A == NA, A == TRUE
- if A == any values other than NA, A == FALSE
If A and E are both NAs, A == TRUE
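To make the three conditions concrete, here is a hand-written sketch for the A/E rule only. The toy data and the column name A_ok are made up for illustration and are not part of df2; the aim is still to generate such an expression from df1 rather than type it out per variable.

```r
library(dplyr)

# Hypothetical toy data (not the df2 from the question).
# The last row has NA in both A and E.
toy <- tibble::tibble(
  A = c(0L, 1L, NA, 5L, NA),
  E = c(1L, 2L, 3L, NA, NA)
)

checked <- toy %>%
  mutate(
    A_ok = case_when(
      E == 1              ~ TRUE,  # dependency satisfied
      is.na(A) & is.na(E) ~ TRUE,  # both missing
      is.na(A)            ~ TRUE,  # dependency not satisfied, A left empty
      TRUE                ~ FALSE  # dependency not satisfied, A has a value
    )
  )

checked$A_ok
```

case_when() treats an NA condition as not matched, so a row with E == NA simply falls through to the later branches.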
Here is the code that works for the first two conditions (df1 is called rules_1 there). My question is how to add the third condition:
library(tidyverse)
library(rlang)
library(glue)
rules_1 <- tibble::tribble(
~var, ~value, ~dep,
"A", "==1", "E",
"B", "==1", "E",
"C", "!=0", "A",
"D", "==2", "G",
"E", NA, NA,
"F", NA, NA,
"G", "%in% c('b','d')", "F",
)
df2 <- data.frame(
stringsAsFactors = FALSE,
ID = c("1q", "2d", "4f", "3g", "8j", "5g", "9l"),
B = c(1L, 1L, NA, 1L, 2L, NA, 1L),
G = c(3L, 3L, NA, 2L, 2L, NA, NA),
A = c(0L, 0L, 1L, 1L, 1L, NA, NA),
C = c(NA, 1L, 1L, NA, NA, 1L, 1L),
D = c(NA, 1L, 1L, 1L, 1L, 3L, 2L),
E = c(2L, 2L, 1L, NA, NA, 3L, 1L),
F = letters[1:7]
)
# And for variables that have NA values in df1, we do not need to do anything.
(rules_2 <- filter(rules_1,
!is.na(dep)))
# rules from data
(rules_3 <- mutate(rules_2,
rule = glue("case_when({dep}{value}~TRUE,is.na({var})~TRUE,TRUE ~ FALSE)")))
(mutators <- rules_3$rule)
names(mutators) <- rules_3$var
(parsed_mutators <- rlang::parse_exprs(mutators))
mutate(df2,
!!!parsed_mutators)
Extend the case_when logic in the glue template with a branch for the case where both {dep} and {var} are NA:
case_when( {dep}{value}~TRUE,
is.na({dep}) & is.na({var}) ~ TRUE,
is.na({var})~TRUE,
TRUE ~ FALSE)
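As a minimal sketch of how this template might slot into the rule-building pipeline, the example below uses reduced, made-up inputs (the names rules, dat and result are hypothetical, not the df1/df2 from the question) so that one row has NA in both the variable and its dependency.

```r
library(dplyr)
library(glue)
library(rlang)

# Hypothetical, reduced inputs for illustration only.
rules <- tibble::tribble(
  ~var, ~value, ~dep,
  "A",  "==1",  "E",
  "D",  "==2",  "G"
)
dat <- data.frame(
  A = c(0L, 1L, NA, NA),
  E = c(1L, 2L, 3L, NA),
  D = c(NA, 1L, 2L, NA),
  G = c(2L, 3L, NA, NA)
)

# Build one case_when() call per rule, including the "both NA" branch.
rules <- mutate(
  rules,
  rule = glue(
    "case_when({dep}{value} ~ TRUE, ",
    "is.na({dep}) & is.na({var}) ~ TRUE, ",
    "is.na({var}) ~ TRUE, ",
    "TRUE ~ FALSE)"
  )
)

# Name each rule after the column it overwrites, then parse to expressions.
mutators <- as.character(rules$rule)
names(mutators) <- rules$var
parsed_mutators <- rlang::parse_exprs(mutators)

result <- mutate(dat, !!!parsed_mutators)
result[, c("A", "D")]
```

Because case_when() treats an NA condition as not matched, the is.na({var}) branch on its own already returns TRUE when both values are NA, so the extra branch mainly makes the intent explicit. Also note that mutate() evaluates the spliced expressions in order, so a rule whose dep was overwritten by an earlier rule (C depends on A in the question's rules) sees the new TRUE/FALSE values rather than the original data.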