Search code examples
rnamissing-data

Coalesce two string columns with alternating missing values to one


I have a data frame with two columns "a" and "b" with alternating missing values (NA)

a      b
dog    <NA>
mouse  <NA>
<NA>   cat
bird   <NA>

I want to "merge" / combine them to a new column c that looks like this, i.e. the non-NA element in each row is selected:

c
dog
mouse
cat
bird

I tried merge and join, but neither worked as I wanted. Maybe because I do not have an id with which to merge? For integers I would just circumvent this and add both columns, but how in my case?


Solution

  • You may try pmax

    df$c <- pmax(df$a, df$b)
    df
    #       a    b     c
    # 1   dog <NA>   dog
    # 2 mouse <NA> mouse
    # 3  <NA>  cat   cat
    # 4  bird <NA>  bird
    

    ...or ifelse:

    df$c <- ifelse(is.na(df$a), df$b, df$a)

    For more general solutions in cases with more than two columns, you find several ways to implement coalesce in R here.