Search code examples
rdplyrtidyverse

Mutate a column and fill its value from another column based on matchin gof two columns


I have a df

Type             Total      
Apple              6
Banana             2
Grapes             7
Unidentified       8

I want to replace the Unidentified Type value with another type (Dominant Type) that has the highest Total value. From the above example, Grapes because the Total value of Grapes 7 is the highest from all the other Total values excluding Unidentified.

Type             Total          Dominant Type
Apple              6                 NA
Banana             2                 NA
Grapes             7                 NA
Unidentified       8               Grapes

How would I complete this in dplyr?


Solution

  • Use mutate to create a new column which is conditional upon Type being equal to Unidentified; if it is, then set it equal to the maximum value of Total, excluding the value of Total which corresponds to Unidentified.

    df = fread('Type             Total      
    Apple              6
    Banana             2
    Grapes             7
    Unidentified       8')
    
    df %>% 
        dplyr::mutate(
            `Dominant Type` = ifelse(Type == "Unidentified", 
                Type[which.max(Total[!Type=="Unidentified"])], 
                NA
                )
            )
    
               Type Total      Dominant Type
    1:        Apple     6   <NA>
    2:       Banana     2   <NA>
    3:       Grapes     7   <NA>
    4: Unidentified     8 Grapes