
Recode multiple values in multiple columns with new values in R


I have a data frame with 18 columns. Columns 2 to 13 hold numeric values such as 0, 1, 2, 4, ... I want to recode them by range into three categories:

if a value in columns 2:13 is 0 -> 0
if a value in columns 2:13 is between 1 and 5 -> 1
otherwise -> 2

My attempt works for the middle category, but it is not efficient:

df[,2:13][df[,2:13] == 1 | df[,2:13] == 2 | df[,2:13] == 3 | df[,2:13] == 4 | df[,2:13] == 5] <- 1

I appreciate your help.


Solution

  • Try findInterval:

    dplyr

    library(dplyr)
    df %>%
      mutate(
        across(2:13, ~ findInterval(., c(0, 1, 5), rightmost.closed = TRUE) - 1L)
      )
    

    If this gets any more complex (such as non-consecutive recoded values), we might switch to case_when:

    df %>%
      mutate(
        across(2:13, ~ case_when(
          . == 0           ~ 0L,
          between(., 1, 5) ~ 1L,
          TRUE             ~ 2L
        ))
      )
    

    base R

    df[,2:13] <- lapply(df[,2:13], function(z) findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L)
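
To see why the breaks c(0, 1, 5) give the three codes, here is a small sketch on a made-up data frame. The columns id, a, b and the helper name recode3 are hypothetical and only for illustration; with the real data you would keep 2:13 as the column range.

```r
# Toy data (hypothetical): an id column plus two numeric columns to recode.
df <- data.frame(
  id = 1:6,
  a  = c(0, 1, 3, 5, 6, 12),
  b  = c(2, 0, 5, 9, 1, 0)
)

# With breaks c(0, 1, 5), findInterval() numbers the intervals as
#   [0, 1) -> 1,  [1, 5] -> 2,  (5, Inf) -> 3
# The second interval includes 5 because rightmost.closed = TRUE.
# Subtracting 1L turns the interval numbers 1, 2, 3 into the codes 0, 1, 2.
recode3 <- function(z) findInterval(z, c(0, 1, 5), rightmost.closed = TRUE) - 1L

recode3(c(0, 1, 3, 5, 6, 12))
# 0 1 1 1 2 2

# Same idea as the base R answer, applied to the toy columns 2:3.
df[, 2:3] <- lapply(df[, 2:3], recode3)
df
```

One thing to watch, assuming the data could contain anything other than non-negative whole numbers: the two approaches are not equivalent at the edges. With findInterval, fractions between 0 and 1 are coded 0, negative values come out as -1, and NA stays NA, whereas the case_when version sends every value that matches neither condition (including NA) to 2.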