Recoding turns everything into the same value in R


I'm practicing R. I created a new column called ROI that holds continuous numeric values, and I want to recode those numbers into string labels like this:

df = mutate(diabetes_df, ROI = ifelse(ROI < 18.5, 'Under', ROI))
df = mutate(diabetes_df, ROI = ifelse(ROI >= 18.5 & ROI <= 25, 'average', ROI))

diabetes_df = mutate(diabetes_df, ROI = ifelse(ROI > 25 & BMI <= 30, 'above average', ROI))

This works as expected and shows the labels wherever the condition is met. However, when I add the last ifelse() statement:

df = mutate(diabetes_df, ROI = ifelse(ROI > 30, 'OVER', ROI))

It turns every value in the new column into OVER. Does anyone know how to make it say OVER only where the condition is met?


Solution

  • We can replicate the problem with the mtcars data frame. In the code below, the third mutate() statement sets wt to High in every row, because after the first mutate() the wt column is a character vector rather than a numeric one.

    library(dplyr)
    data(mtcars)
    mtcars <- mutate(mtcars, wt = ifelse(wt < 2.6, "Low", wt))
    # at this point, wt is character
    str(mtcars$wt)
    
    
    > str(mtcars$wt)
     chr [1:32] "2.62" "2.875" "Low" "3.215" "3.44" "3.46" "3.57" "3.19" "3.15" ...
    

    By the third mutate(), every row satisfies the ifelse() condition because the comparison is now made between character strings. R converts the number 3.61 to the string "3.61" and compares strings, and both Low and Medium sort after it, as do the values the second mutate() left untouched.

    # wt is already character here, so the comparisons below are string comparisons
    mtcars <- mutate(mtcars, wt = ifelse(2.6 <= wt & wt <= 3.61, "Medium", wt))
    mtcars <- mutate(mtcars, wt = ifelse(wt > 3.61, "High", wt))
    

    ...and the output:

    > head(mtcars)
                       mpg cyl disp  hp drat   wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 High 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 High 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85 High 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 High 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 High 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 High 20.22  1  0    3    1
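
The coercion behind that result can be checked on a small vector. This is only a sketch: the vector x and the name step1 are made up for illustration, and the last comparison depends on letters sorting after digits, which holds in common locales but is a locale setting rather than a guarantee.

```r
# Illustrative values only; x and step1 are not part of the original answer
x <- c(2.62, 2.32, 3.84)

# ifelse() returns a single atomic vector, so mixing "Low" with numbers
# forces the whole result to character
step1 <- ifelse(x < 2.6, "Low", x)
class(step1)            # "character"

# Comparing character with numeric converts the number to a string first
"10" > 9                # FALSE: "1" sorts before "9"

# Letters sort after digits in common locales, so each element compares
# as greater than "3.61"
c("Low", "Medium", "3.84") > 3.61
```

Once the column holds strings, wt > 3.61 no longer asks whether a weight exceeds 3.61; it asks whether a string sorts after "3.61".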
    

    We can prevent this behavior with case_when(), which evaluates all of the comparisons against the numeric version of wt in a single pass over the data.

    # use case_when()
    data(mtcars)
    mtcars %>% mutate(wt = case_when(
         wt < 2.6 ~ "Low",
         wt >= 2.6 & wt <= 3.61 ~ "Medium",
         wt > 3.61 ~ "High"
    )) %>% head(.)
    

    ...and the output:

    head(.)
                       mpg cyl disp  hp drat     wt  qsec vs am gear carb
    Mazda RX4         21.0   6  160 110 3.90 Medium 16.46  0  1    4    4
    Mazda RX4 Wag     21.0   6  160 110 3.90 Medium 17.02  0  1    4    4
    Datsun 710        22.8   4  108  93 3.85    Low 18.61  1  1    4    1
    Hornet 4 Drive    21.4   6  258 110 3.08 Medium 19.44  1  0    3    1
    Hornet Sportabout 18.7   8  360 175 3.15 Medium 17.02  0  0    3    2
    Valiant           18.1   6  225 105 2.76 Medium 20.22  1  0    3    1
    > 
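
The same principle also works with ifelse(), provided every test runs against the numeric values before any label is written. A minimal sketch in base R, where wt_num and wt_label are hypothetical names chosen for illustration:

```r
data(mtcars)

# Keep the numeric weights in their own variable so every test sees numbers
wt_num <- mtcars$wt

# Nested ifelse(): the inner call only decides rows that failed the outer test
wt_label <- ifelse(wt_num < 2.6, "Low",
                   ifelse(wt_num <= 3.61, "Medium", "High"))

head(wt_label)
```

Nesting gets hard to read beyond three or four categories, which is one reason case_when() is usually the more comfortable choice.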
    

    From the comments on this answer, it wasn't clear to the OP how to save the changed column back to the existing data frame. The following snippet assigns the result back to mtcars.

    data(mtcars)
    mtcars %>% mutate(wt = case_when(
         wt < 2.6 ~ "Low",
         wt >= 2.6 & wt <= 3.61 ~ "Medium",
         wt > 3.61 ~ "High"
    )) -> mtcars
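
Applied to the question's own column, the same pattern might look like the sketch below. The sample values in diabetes_df are invented stand-ins for the OP's data, and ROI_group is a hypothetical column name. Writing the labels to a new column is a design choice that keeps the numeric ROI available for later comparisons.

```r
library(dplyr)

# Invented sample values; the OP's real diabetes_df is not shown in the question
diabetes_df <- data.frame(ROI = c(17.2, 22.0, 27.5, 33.1))

diabetes_df <- diabetes_df %>%
  mutate(ROI_group = case_when(
    ROI < 18.5              ~ "Under",
    ROI >= 18.5 & ROI <= 25 ~ "average",
    ROI > 25 & ROI <= 30    ~ "above average",
    ROI > 30                ~ "OVER"
  ))

diabetes_df
```

Rows that match none of the conditions, such as a missing ROI, get NA from case_when() rather than a label.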