I have a dataset DT as below:
category: number 1-9
xxx, yyy, zzz: binary (0,1)
category xxx yyy zzz
8 1 0 0
1 0 0 0
4 0 1 1
9 0 0 1
8 0 1 0
I would like to check multiple conditions using a 'for' loop and 'case_when'.
In the end, I would like the data to look like this:
category xxx yyy zzz result_xxx result_yyy result_zzz
8 1 0 0 8 0 0
1 0 0 0 0 0 0
4 0 1 1 0 4 4
9 0 0 1 0 0 9
8 0 1 0 0 8 0
To do this, I wrote the code below:
condition.vars <- c("xxx", "yyy", "zzz")
for(i in condition.vars){
browser()
DT <- DT[, condition:= case_when(
([[i]] == 1 & category ==1) ~ 1,
([[i]] == 1 & category ==2) ~ 2,
([[i]] == 1 & category ==3) ~ 3,
([[i]] == 1 & category ==4) ~ 4,
([[i]] == 1 & category ==5) ~ 5,
([[i]] == 1 & category ==6) ~ 6,
([[i]] == 1 & category ==7) ~ 7,
([[i]] == 1 & category ==8) ~ 8,
([[i]] == 1 & category ==9) ~ 9,
TRUE ~ 0
)]
setnames(DT, "condition", paste0("result", i))
}
As you might expect, it does not work.
Could you please help me correct my code?
You don't need a for loop or case_when. If DT is a data.frame, you can simplify this to:
condition.vars <- c("xxx", "yyy", "zzz")
DT[paste0('result_', condition.vars)] <- DT$category * DT[condition.vars]
# category xxx yyy zzz result_xxx result_yyy result_zzz
#1 8 1 0 0 8 0 0
#2 1 0 0 0 0 0 0
#3 4 0 1 1 0 4 4
#4 9 0 0 1 0 0 9
#5 8 0 1 0 0 8 0
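The reason this works: every branch of the case_when returns category itself when the flag is 1, and 0 otherwise, which is the same as multiplying category by the 0/1 flag. Here is a minimal, self-contained sketch of the data.frame version. The data are rebuilt by hand from the table in the question, and new.cols is just an illustrative helper name, not something from the original code.

```r
# Rebuild the example data from the question as a plain data.frame
DT <- data.frame(
  category = c(8, 1, 4, 9, 8),
  xxx      = c(1, 0, 0, 0, 0),
  yyy      = c(0, 0, 1, 0, 1),
  zzz      = c(0, 0, 1, 1, 0)
)

condition.vars <- c("xxx", "yyy", "zzz")
new.cols <- paste0("result_", condition.vars)  # illustrative helper name

# category is multiplied into each 0/1 column:
# 1 * category gives category, 0 * category gives 0
DT[new.cols] <- DT$category * DT[condition.vars]

print(DT)
```

This assumes the flag columns contain only 0 and 1, as stated in the question. If they could hold NA, the result would also be NA for those rows.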
If DT is a data.table, you can do:
library(data.table)
DT[, paste0('result_', condition.vars) := category * .SD, .SDcols = condition.vars]
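For completeness, here is a runnable sketch of the data.table version with the example data from the question, followed by a loop-based variant for anyone who still prefers to iterate over the column names. The loop uses data.table's set() and plain multiplication instead of case_when. The names new.cols and DT2 are illustrative only.

```r
library(data.table)

# Rebuild the example data from the question as a data.table
DT <- data.table(
  category = c(8, 1, 4, 9, 8),
  xxx      = c(1, 0, 0, 0, 0),
  yyy      = c(0, 0, 1, 0, 1),
  zzz      = c(0, 0, 1, 1, 0)
)

condition.vars <- c("xxx", "yyy", "zzz")
new.cols <- paste0("result_", condition.vars)  # illustrative helper name

# Vectorised version: .SD holds only the columns named in .SDcols
DT[, (new.cols) := category * .SD, .SDcols = condition.vars]

# Loop version on a copy, one result column per iteration
DT2 <- copy(DT[, c("category", condition.vars), with = FALSE])
for (i in condition.vars) {
  # DT2[[i]] extracts the column whose name is stored in i
  set(DT2, j = paste0("result_", i), value = DT2$category * DT2[[i]])
}

print(DT)
print(DT2)
```

In the original loop, `[[i]]` had nothing to index into, which is one reason it failed. Inside a data.table you need something like get(i) or DT[[i]] to look a column up by the name stored in a variable.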