
Replace values in column by factor level


I have a survey data.frame with 100 columns, and each column is a factor with 2 levels: Yes or No. However, some surveys have answers like Yes!, Nope, Yay or Nah, which really mean yes or no.

My first question: how can I convert the values in all columns based on their factor level? E.g. if the factor level is 1, replace the text with Yes, else No.

My second question: sometimes I am left with a 3rd level that isn't used. How can I remove all unused factor levels in ALL columns of the data frame? I have more than 100 columns.


Solution

  • We can loop over the columns with lapply and rename the matching levels using %in%:

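    df1[] <- lapply(df1, function(x) {
      # rename every "yes"-like level to "Yes" and every "no"-like level to "No";
      # levels that end up with the same name are merged automatically
      levels(x)[levels(x) %in% c("Yes!", "Yay")] <- "Yes"
      levels(x)[levels(x) %in% c("Nope", "Nah")] <- "No"
      x
    })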
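
    As an alternative sketch (not part of the approach above), base R also lets us assign a named list to levels(), where each name is the new level and each value lists the old levels it should absorb. The data frame svy and the list yn_map below are made-up names for illustration only.

    ```r
    # Hypothetical mapping: new level name = old spellings it replaces
    yn_map <- list(Yes = c("Yes", "Yes!", "Yay"),
                   No  = c("No", "Nope", "Nah"))

    # Hypothetical survey data with inconsistent answers
    svy <- data.frame(q1 = c("Yes!", "Nope", "Yay"),
                      q2 = c("Nah", "No", "Yes"),
                      stringsAsFactors = TRUE)

    # Apply the same mapping to every column; svy[] keeps the data.frame structure
    svy[] <- lapply(svy, function(x) {
      levels(x) <- yn_map
      x
    })

    as.character(svy$q1)  # "Yes" "No" "Yes"
    as.character(svy$q2)  # "No" "No" "Yes"
    ```

    With the list form every column gets the levels in the order given in the list ("Yes", then "No"). A level that is not mentioned in the list ends up as NA, so it is worth checking the existing levels before relying on this.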
    

    To drop the unused levels, we can use droplevels, which works on a whole data frame:

    df2 <- droplevels(df1)
    

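    In this example, though, that is already taken care of by the assignment we did earlier: giving several levels the same name merges them, so each column is left with at most the two levels "No" and "Yes". A level that has no observations and matches neither list would still need droplevels.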

    df1
    #   Col1 Col2 Col3
    #1   Yes   No   No
    #2   Yes  Yes   No
    #3    No   No   No
    #4    No   No   No
    #5    No  Yes   No
    #6    No   No   No
    #7   Yes  Yes   No
    #8    No  Yes   No
    #9    No   No   No
    #10  Yes  Yes   No
    
    
    str(df1)
    #'data.frame':   10 obs. of  3 variables:
    #$ Col1: Factor w/ 2 levels "No","Yes": 2 2 1 1 1 1 2 1 1 2
    #$ Col2: Factor w/ 2 levels "No","Yes": 1 2 1 1 2 1 2 2 1 2
    #$ Col3: Factor w/ 1 level "No": 1 1 1 1 1 1 1 1 1 1
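
    To see droplevels act on a level that is declared but never observed (the case from the second question), here is a small sketch. The data frame svy and the extra level "Maybe" are assumptions made up for this illustration.

    ```r
    # Hypothetical data: "Maybe" is a declared level with no observations
    lv <- c("No", "Yes", "Maybe")
    svy <- data.frame(
      q1 = factor(c("Yes", "No", "Yes"), levels = lv),
      q2 = factor(c("No", "No", "Yes"), levels = lv)
    )

    sapply(svy, nlevels)    # 3 levels in each column before

    # droplevels() on a data frame drops unused levels in every factor column
    svy <- droplevels(svy)

    sapply(svy, nlevels)    # 2 levels in each column after
    ```

    Because droplevels handles all factor columns in one call, the same line should work unchanged on a data frame with 100 or more columns.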
    

    data

    set.seed(24)
    df1 <- data.frame(
      Col1 = sample(c("Yes", "Yes!", "Yay", "Nope", "Nah", "No"), 10, replace = TRUE),
      Col2 = sample(c("Yes", "Yes!", "Yay", "Nope", "Nah", "No"), 10, replace = TRUE),
      Col3 = sample(c("Nope", "Nah", "No"), 10, replace = TRUE),
      # needed in R >= 4.0.0, where strings are no longer converted to factors by default
      stringsAsFactors = TRUE
    )