Tags: r, dplyr, tidyverse, recode

How to use recode_factor in dplyr for recoding multiple factor values?


     countrycode event
1713         ESP 110mh
1009         NED    HJ
536          BLR    LJ
2882         FRA 1500m
509          EST    LJ
2449         BEL    PV
1022         EST    HJ
2530         USA    JT
2714         CUB    JT
1236         HUN  400m
238          BLR  100m
2518         USA    JT
1575         FRA 110mh
615          JPN    LJ
1144         GER    HJ
596          CAN    LJ
2477         HUN    JT
1046         GER    HJ
2501         FIN    DT
2176         KAZ    PV

I want to create a new factor vector in my data frame, eventtype, where:

Rows with 100m, 400m, 110mh, or 1500m in the event variable are grouped as Runs; DT, SP, and JT are grouped as Throws; and LJ, HJ, and PV are grouped as Jumps.

I can recode a single value with something like df$eventtype <- recode_factor(df$event, `100m`="Running"), which works for one event, but I looked in the documentation and could not find an easy way to recode multiple values in one function call.

Edit: of course, if another function serves my purposes better, I will use that.


Solution

  • The ... argument of recode_factor accepts any number of old = new pairs, and several old values can map to the same new level:

    library(dplyr)
    
    # Rebuild the sample data; the first column holds the original row numbers
    df <- read.table(header = TRUE, text = "
    number countrycode event
    1713         ESP 110mh
    1009         NED    HJ
    536          BLR    LJ
    2882         FRA 1500m
    509          EST    LJ
    2449         BEL    PV
    1022         EST    HJ
    2530         USA    JT
    2714         CUB    JT
    1236         HUN  400m
    238          BLR  100m
    2518         USA    JT
    1575         FRA 110mh
    615          JPN    LJ
    1144         GER    HJ
    596          CAN    LJ
    2477         HUN    JT
    1046         GER    HJ
    2501         FIN    DT
    2176         KAZ    PV
    ")
    
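    # Map each old value to its group; the new levels follow the order of the
    # replacements (Runs, Throws, Jumps)
    df$eventtype <- recode_factor(df$event, `100m` = "Runs", `400m` = "Runs", 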
                                  `110mh` = "Runs", `1500m` = "Runs", 
                                  DT = "Throws", SP = "Throws", JT = "Throws",
                                  LJ = "Jumps", HJ = "Jumps", PV = "Jumps")
    
    # Or inside a mutate() call (assign the result to keep the new column)
    df %>% 
      mutate(eventtype = recode_factor(event, `100m` = "Runs", `400m` = "Runs", 
                                       `110mh` = "Runs", `1500m` = "Runs", 
                                       DT = "Throws", SP = "Throws", JT = "Throws",
                                       LJ = "Jumps", HJ = "Jumps", PV = "Jumps"))
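
Since the question also asks about alternatives, here is a minimal base R sketch that avoids repeating "Runs", "Throws" and "Jumps" for every value. It is not part of the original answer: the names `groups`, `lookup` and the short `event` vector are made up for illustration, and the idea is simply to describe each group once and look the events up in a named vector.

```r
# Hypothetical grouping list: each name is a new level,
# each element lists the old values that level absorbs
groups <- list(
  Runs   = c("100m", "400m", "110mh", "1500m"),
  Throws = c("DT", "SP", "JT"),
  Jumps  = c("LJ", "HJ", "PV")
)

# Flatten to a named character vector:
# names are the old values, elements are the new levels
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# A few events taken from the sample data above
event <- c("110mh", "HJ", "LJ", "1500m", "JT", "DT")

# Look up each event by name; an event missing from `lookup` would become NA
eventtype <- factor(unname(lookup[event]), levels = names(groups))
```

On the data frame from the question, the last step would read df$eventtype <- factor(unname(lookup[as.character(df$event)]), levels = names(groups)). The same named vector should also work with the dplyr approach above by splicing it into the call, as in recode_factor(df$event, !!!lookup), which is the pattern the dplyr documentation shows for recode().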