Search code examples
rdplyrcoalesce

get results of coalesce in R


I have the following data:

tibble(car1 = c("honda", NA, NA), 
       car2 = c(NA, "mazda", NA), 
       car3 = c("toyota", "ferrari", "porsche")
) %>%
          mutate(pick = coalesce(car1, car2, car3))

How can I capture the result of the selected column from coalesce into a new column (col_choice)?
Ideally I want the data to look as follows:

  car1  car2  car3    pick    col_choice
1 honda NA    toyota  honda   car1
2 NA    mazda ferrari mazda   car2
3 NA    NA    porsche porsche car3


Solution

  • coalesce returns the first non-NA element from the selected columns for each row. If we want to get the column name, apply max.col on a logical matrix (TRUE for non-NA) to select the first index to subset the column name

    library(dplyr)
    library(stringr)
    tibble(car1 = c("honda", NA, NA), 
           car2 = c(NA, "mazda", NA), 
           car3 = c("toyota", "ferrari", "porsche")
    ) %>%
              mutate(pick = coalesce(car1, car2, car3), 
           col_choice = str_subset(names(.),
            '^car')[max.col(!is.na(across(starts_with('car'))), 'first')])
    

    -output

    # A tibble: 3 × 5
      car1  car2  car3    pick    col_choice
      <chr> <chr> <chr>   <chr>   <chr>     
    1 honda <NA>  toyota  honda   car1      
    2 <NA>  mazda ferrari mazda   car2      
    3 <NA>  <NA>  porsche porsche car3