I have two logical vectors in a data frame:
df <- data.frame(log1 = c(FALSE, FALSE, TRUE, FALSE, TRUE), log2 = c(TRUE, FALSE, FALSE, FALSE, TRUE))
I want to make a third column by combining these two. But this new column should not simply contain logical values. Instead, it should assign one of three values - "high", "outlier", or "normal" to the third column. "High" takes precedence, so the third column should show "high" and not "outlier" for row 5.
I guess it's possible to do this using if and else, but I couldn't make it work with the following code:
df$new <- NA
if(df$log1 == TRUE){
df$new <- "high"
} else if(df$log2 == TRUE) {
df$new <- "outlier"
} else {
df$new <- "normal"
}
Can anyone help?
This is all about ifelse and its derivatives.
ifelse(df$log1, "high", ifelse(df$log2, "outlier", "normal"))
# [1] "outlier" "normal" "high" "normal" "high"
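To spell out why the original attempt fails: if expects a single TRUE/FALSE value, not a whole column, whereas ifelse tests every row at once. Below is a minimal sketch that stores the result in the data frame and checks it against an equivalent logical-indexing approach; the name new_alt is only for illustration and is not part of the question.

```r
df <- data.frame(log1 = c(FALSE, FALSE, TRUE, FALSE, TRUE),
                 log2 = c(TRUE, FALSE, FALSE, FALSE, TRUE))

# ifelse() is vectorized: the test is evaluated for every row,
# and the nested call handles the rows where log1 is FALSE.
df$new <- ifelse(df$log1, "high", ifelse(df$log2, "outlier", "normal"))

# Equivalent base R alternative using logical indexing. Later assignments
# overwrite earlier ones, so "high" goes last to give it precedence.
new_alt <- rep("normal", nrow(df))
new_alt[df$log2] <- "outlier"
new_alt[df$log1] <- "high"

identical(df$new, new_alt)
# [1] TRUE
```

The indexing version can be easier to read when there are only a few labels, but the order of the assignments matters, which is easy to get wrong.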
We can nest dplyr::if_else, but nesting generally encourages us to use case_when.
library(dplyr)
df %>%
mutate(
new1 = if_else(log1, "high", if_else(log2, "outlier", "normal")),
new2 = case_when(log1 ~ "high", log2 ~ "outlier", TRUE ~ "normal")
)
# log1 log2 new1 new2
# 1 FALSE TRUE outlier outlier
# 2 FALSE FALSE normal normal
# 3 TRUE FALSE high high
# 4 FALSE FALSE normal normal
# 5 TRUE TRUE high high
Similarly, fifelse and fcase:
library(data.table)
as.data.table(df)[, new1 := fifelse(log1, "high", fifelse(log2, "outlier", "normal"))
][, new2 := fcase(log1, "high", log2, "outlier", default = "normal")][]
# log1 log2 new1 new2
# <lgcl> <lgcl> <char> <char>
# 1: FALSE TRUE outlier outlier
# 2: FALSE FALSE normal normal
# 3: TRUE FALSE high high
# 4: FALSE FALSE normal normal
# 5: TRUE TRUE high high
Note that while dplyr::case_when above uses tilde-formulas as in cond1 ~ value1, cond2 ~ value2, the fcase variant uses alternating arguments: cond1, value1, cond2, value2, ...
Also, the default= argument works so long as it is a constant. If a dynamic default value (i.e., based on table contents) is desired, then one needs to add a final condition that is an all-TRUE vector, as in fcase(..., rep(TRUE, .N), NEWVALUE).
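As a rough sketch of that last point, suppose the table has an extra character column holding the per-row fallback label. The column name fallback and its values are made up for illustration and are not part of the question's data.

```r
library(data.table)

dt <- data.table(
  log1     = c(FALSE, FALSE, TRUE, FALSE, TRUE),
  log2     = c(TRUE, FALSE, FALSE, FALSE, TRUE),
  fallback = c("a", "b", "c", "d", "e")  # hypothetical per-row default
)

# The final condition is TRUE for every row, so any row not matched by
# log1 or log2 takes its value from the fallback column.
dt[, new := fcase(log1, "high",
                  log2, "outlier",
                  rep(TRUE, .N), fallback)]

dt$new
```

Here rows 2 and 4 match neither condition, so they pick up "b" and "d" from fallback, while the other rows are labelled as before.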