Recoding range of numerics into single numeric in R


I am trying to recode a data frame with four columns. Across all of the columns, I want to recode all the numeric values into these ordinal numeric values:

  • 0 stays as is
  • 1:3 <- 1
  • 4:10 <- 2
  • 11:22 <- 3
  • 23:max <- 4

This is the data frame:

> df
   T4.1 T4.2 T4.3 T4.4
1     0   54    0    5
2     0    5    0    0
3     0    3    0    0
4     0    2    0    0
5     0    3    0    0
6     0    2    0    0
7     0    4    0    0
8     1   20    0    0
9     1    7    0    2
10    0   14    0    0
11    0    3    0    0
12    0  202    0   41
13    2   12    0    0
14    3    6    0    0
15    3   21    0    3
16    0  143    0    0
17    0    0    0    0
18    4    9    0    0
19    3   15    0    0
20    0   58    0    6
21    2    0    0    0
22    0   52    0    0
23    0    3    0    0
24    0    1    0    0
25    4    6    0    1
26    1    4    0    0
27    0   38    0    1
28    0    6    0    0
29    0    8    0    0
30    0   29    0    4
31    1   14    0    0
32    0   12    0   10
33    4    1    0    3

I'm trying to use the recode function, but I can't seem to figure out how to input a range of numeric values into it. I get the following errors with my attempts:

> recode(df, 11:22=3)
Error: unexpected '=' in "recode(df, 11:22="
> recode(df, c(11:22)=3)
Error: unexpected '=' in "recode(df, c(11:22)="

I would greatly appreciate any advice. Thanks for your time!

Edit: Thanks all for the help!!


Solution

  • You can use cut, with one break point between each pair of adjacent ranges:

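    # Bin every column with cut(). The breaks sit halfway between integers,
    # so the intervals are 0, 1-3, 4-10, 11-22 and 23 or more.
    df_res <- as.data.frame(sapply(df, function(x) cut(x,
                breaks = c(-0.5, 0.5, 3.5, 10.5, 22.5, Inf),
                labels = c(0, 1, 2, 3, 4))))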
    
    str(df_res)
    #'data.frame':  33 obs. of  4 variables:
    # $ T4.1: Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 2 2 1 ...
    # $ T4.2: Factor w/ 5 levels "0","1","2","3",..: 5 3 2 2 2 2 3 4 3 4 ...
    # $ T4.3: Factor w/ 1 level "0": 1 1 1 1 1 1 1 1 1 1 ...
    # $ T4.4: Factor w/ 4 levels "0","1","2","4": 3 1 1 1 1 1 1 1 2 1 ...
    
    df_res
    #    T4.1 T4.2 T4.3 T4.4
    # 1     0    4    0    2
    # 2     0    2    0    0
    # 3     0    1    0    0
    # 4     0    1    0    0
    # 5     0    1    0    0
    # 6     0    1    0    0
    # 7     0    2    0    0
    # 8     1    3    0    0
    # 9     1    2    0    1
    # 10    0    3    0    0
    # 11    0    1    0    0
    # 12    0    4    0    4
    # 13    1    3    0    0
    # 14    1    2    0    0
    # 15    1    3    0    1
    # 16    0    4    0    0
    # 17    0    0    0    0
    # 18    2    2    0    0
    # 19    1    3    0    0
    # 20    0    4    0    2
    # 21    1    0    0    0
    # 22    0    4    0    0
    # 23    0    1    0    0
    # 24    0    1    0    0
    # 25    2    2    0    1
    # 26    1    2    0    0
    # 27    0    4    0    1
    # 28    0    2    0    0
    # 29    0    2    0    0
    # 30    0    4    0    2
    # 31    1    3    0    0
    # 32    0    3    0    2
    # 33    2    1    0    1
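
The cut-based result above stores the codes as factors (or, depending on the R version, as character strings), not as numbers. If numeric codes are needed, one possible alternative is base R's findInterval, which returns integers directly. The following is a minimal sketch, not part of the original answer. The names lower_bounds and df_num are made up for illustration, the sample df is just the first five rows of the question's data retyped so the snippet runs on its own, and the values are assumed to be non-negative whole numbers.

```r
# First five rows of the question's data frame, retyped so this runs standalone.
df <- data.frame(
  T4.1 = c(0, 0, 0, 0, 0),
  T4.2 = c(54, 5, 3, 2, 3),
  T4.3 = c(0, 0, 0, 0, 0),
  T4.4 = c(5, 0, 0, 0, 0)
)

# Smallest value belonging to each of the codes 1, 2, 3 and 4.
# Anything below the first bound (here: 0) gets code 0.
lower_bounds <- c(1, 4, 11, 23)

# findInterval() counts how many lower bounds each value has reached,
# which is the ordinal code. Assigning into df_num[] keeps the
# data frame shape and the column names.
df_num <- df
df_num[] <- lapply(df, findInterval, vec = lower_bounds)

str(df_num)
```

With these bounds, 22 is coded as 3 and 23 as 4, matching the break at 22.5 used in the cut call. Because the columns come back as integers, they can be used in arithmetic or summaries without first converting from factor.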