Search code examples
rcluster-analysisanalyticsdata-analysisanalysis

Coalesce columns in df


structure(list(Primary.Warning.Vertical = c("N/A", "N/A", "N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A"), Primary.Warning.Horizontal = c("2", 
"2", "1.1", "2", "2", "2", "1.7", "2", "2", "2"), Secondary.Sensor.Warning.Vertical = c("N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A"
), Secondary.Sensor.Warning.Horizontal = c("N/A", "N/A", "N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A")), row.names = c(NA, 
10L), class = "data.frame")

throughout the larger df one of these 4 columns will have a value in it. I am trying to create a column that will coalesce the non NA values into one

I've tried:

df %>% mutate(TotalWarning = coalesce(Primary.Warning.Vertical, Primary.Warning.Horizontal, Secondary.Sensor.Warning.Vertical, Secondary.Sensor.Warning.Horizontal)) 

but the TotalWarning column is full of NA's.

Thanks for the help!


Solution

  • coalesce() only works on "real" missing values. In your data, "N/A" is character, so at first you need to convert them to NA.

    library(dplyr)
    
    df %>%
      mutate(across(where(is.character), na_if, "N/A"),
             TotalWarning = coalesce(Primary.Warning.Vertical,
                                     Primary.Warning.Horizontal,
                                     Secondary.Sensor.Warning.Vertical,
                                     Secondary.Sensor.Warning.Horizontal))
    
    #    Primary.Warning.Vertical Primary.Warning.Horizontal Secondary.Sensor.Warning.Vertical Secondary.Sensor.Warning.Horizontal TotalWarning
    # 1                      <NA>                          2                              <NA>                                <NA>            2
    # 2                      <NA>                          2                              <NA>                                <NA>            2
    # 3                      <NA>                        1.1                              <NA>                                <NA>          1.1
    # 4                      <NA>                          2                              <NA>                                <NA>            2
    # 5                      <NA>                          2                              <NA>                                <NA>            2
    # 6                      <NA>                          2                              <NA>                                <NA>            2
    # 7                      <NA>                        1.7                              <NA>                                <NA>          1.7
    # 8                      <NA>                          2                              <NA>                                <NA>            2
    # 9                      <NA>                          2                              <NA>                                <NA>            2
    # 10                     <NA>                          2                              <NA>                                <NA>            2
    

    Your variable names are too tedious. To simplify the code, you can also do this:

    df %>%
      mutate(across(where(is.character), na_if, "N/A"),
             TotalWarning = do.call(coalesce, cur_data()))