r, tidyverse, data-cleaning

Renaming multiple levels - change "man", "Male", "M" into "male" in R


I'm handling a dataset which contains a gender column with messy data. I'd like to change "man", "Male", "M" and "MALE" all into "male".

Is there a convenient way to do this, such as grouping them together and assigning them the same new name "male"? I tried several packages, but none of them could assign one new name to multiple old names.

Thank you so much!! This is the final project of my first semester :)


Solution

  • library(tidyverse)
    gender <- c("man", "Male", "M", "MALE", "female", "f", "F", "women")
    df <- data.frame(gender)
    

    Here is a tidyverse solution:

    df %>% 
      mutate(
        new_gender = ifelse(gender %in% c("man", "Male", "M", "MALE"), "men", "women") 
      )
    

    You get:

      gender new_gender
    1    man        men
    2   Male        men
    3      M        men
    4   MALE        men
    5 female      women
    6      f      women
    7      F      women
    8  women      women
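
Note that `ifelse()` above labels every value outside the listed set as "women", including `NA` and any typo, and it produces "men" rather than the "male" asked for in the question. A possible variation, offered only as a sketch: lower-case the column first so that "Male", "MALE" and "male" collapse to one spelling, then map explicit sets with `case_when()` and leave anything unrecognised as `NA`. The names `male_labels`, `female_labels`, `gender_lower` and `df_clean` are made up for this illustration, and the label sets are assumptions you would extend to match your own data.

```r
library(dplyr)

# Same example data as in the answer above
gender <- c("man", "Male", "M", "MALE", "female", "f", "F", "women")
df <- data.frame(gender)

# Hypothetical lookup sets, written in lower case; extend as needed
male_labels   <- c("man", "male", "m")
female_labels <- c("women", "woman", "female", "f")

df_clean <- df %>%
  mutate(
    # Normalise case and surrounding whitespace before matching
    gender_lower = tolower(trimws(gender)),
    new_gender = case_when(
      gender_lower %in% male_labels   ~ "male",
      gender_lower %in% female_labels ~ "female",
      TRUE                            ~ NA_character_  # unrecognised values stay NA
    )
  ) %>%
  select(-gender_lower)  # drop the helper column

df_clean
```

Checking `sum(is.na(df_clean$new_gender))` afterwards shows how many values were not covered by either set. If the column is a factor, `forcats::fct_collapse()` is another way to merge several levels into one.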