r cluster-analysis analytics data-analysis analysis

Coalesce columns in df

structure(list(Primary.Warning.Vertical = c("N/A", "N/A", "N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A"), Primary.Warning.Horizontal = c("2", 
"2", "1.1", "2", "2", "2", "1.7", "2", "2", "2"), Secondary.Sensor.Warning.Vertical = c("N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A"
), Secondary.Sensor.Warning.Horizontal = c("N/A", "N/A", "N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A")), row.names = c(NA, 
10L), class = "data.frame")

throughout the larger df one of these 4 columns will have a value in it. I am trying to create a column that will coalesce the non NA values into one

I've tried:

df %>% mutate(TotalWarning = coalesce(Primary.Warning.Vertical, Primary.Warning.Horizontal, Secondary.Sensor.Warning.Vertical, Secondary.Sensor.Warning.Horizontal))

but the TotalWarning column is full of NA's.

Thanks for the help!

Solution

coalesce() only works on "real" missing values. In your data, "N/A" is character, so at first you need to convert them to NA.

library(dplyr)

df %>%
  mutate(across(where(is.character), na_if, "N/A"),
         TotalWarning = coalesce(Primary.Warning.Vertical,
                                 Primary.Warning.Horizontal,
                                 Secondary.Sensor.Warning.Vertical,
                                 Secondary.Sensor.Warning.Horizontal))

#    Primary.Warning.Vertical Primary.Warning.Horizontal Secondary.Sensor.Warning.Vertical Secondary.Sensor.Warning.Horizontal TotalWarning
# 1                      <NA>                          2                              <NA>                                <NA>            2
# 2                      <NA>                          2                              <NA>                                <NA>            2
# 3                      <NA>                        1.1                              <NA>                                <NA>          1.1
# 4                      <NA>                          2                              <NA>                                <NA>            2
# 5                      <NA>                          2                              <NA>                                <NA>            2
# 6                      <NA>                          2                              <NA>                                <NA>            2
# 7                      <NA>                        1.7                              <NA>                                <NA>          1.7
# 8                      <NA>                          2                              <NA>                                <NA>            2
# 9                      <NA>                          2                              <NA>                                <NA>            2
# 10                     <NA>                          2                              <NA>                                <NA>            2

Your variable names are too tedious. To simplify the code, you can also do this:

df %>%
  mutate(across(where(is.character), na_if, "N/A"),
         TotalWarning = do.call(coalesce, cur_data()))