I have a data frame with 18 columns. Columns 2 to 13 contain numeric values such as 0, 1, 2, 4, ... I want to recode them by range into three categories:
if columns 2:13 are 0 -> 0
if columns 2:13 between 1 & 5 -> 1
else columns 2:13 -> 2.
My attempt works, but it is not efficient:
df[,2:13][df[,2:13] == 1 | df[,2:13] == 2 | df[,2:13] == 3 | df[,2:13] == 4 | df[,2:13] == 5] <- 1
I appreciate your help.
Try findInterval:
library(dplyr)
df %>%
  mutate(
    across(2:13, ~ findInterval(., c(0, 1, 5), rightmost.closed = TRUE) - 1L)
  )
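To see why the breaks c(0, 1, 5) give the requested categories, here is a small sketch on a made-up vector (the values in x are hypothetical, not from the question's data). It assumes the columns hold non-negative whole numbers, as in the question.

```r
# Hypothetical sample values, not taken from the question's data
x <- c(0, 1, 2, 4, 5, 6, 10)

# Breaks 0, 1, 5 define three bins: [0, 1), [1, 5], and everything above 5.
# rightmost.closed = TRUE keeps x == 5 in the second bin instead of the third.
# findInterval() numbers the bins 1, 2, 3, so subtract 1L to get 0, 1, 2.
recoded <- findInterval(x, c(0, 1, 5), rightmost.closed = TRUE) - 1L

print(recoded)
# [1] 0 1 1 1 1 2 2
```

Note that a negative value would fall below the first break and come out as -1, so this relies on the data never going below 0.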
If this gets any more complex (such as non-consecutive recoded values), we might switch to case_when:
df %>%
  mutate(
    across(2:13, ~ case_when(
      . == 0 ~ 0L,
      between(., 1, 5) ~ 1L,
      TRUE ~ 2L
    ))
  )
The same recode in base R, without dplyr:
df[,2:13] <- lapply(df[,2:13], function(z) findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L)
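As a quick check of the lapply approach, here is a sketch on a toy data frame (the name toy and its columns are hypothetical). Only columns 2:3 are recoded here, standing in for columns 2:13 of the real data; the other columns are left untouched.

```r
# Toy data frame (hypothetical); v1 and v2 play the role of columns 2:13
toy <- data.frame(
  id   = c("a", "b", "c"),
  v1   = c(0, 3, 7),
  v2   = c(5, 0, 12),
  note = c("x", "y", "z")
)

# Recode each selected column and assign the resulting list back in place
toy[, 2:3] <- lapply(toy[, 2:3], function(z) {
  findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L
})

print(toy)
```

After the assignment, v1 becomes 0, 1, 2 and v2 becomes 1, 0, 2, while id and note keep their original values.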