Search code examples
rdplyrcoalesce

Iteratively dplyr::coalesce()


I have a dataset that I am needing to use dplyr::coalesce() on. But I want to do this multiple times and am not sure about what is a more efficient way of doing this (e.g. loop, apply, etc).

To give you a toy example, say my dataset is:

df = data.frame(
    a = c(1, NA, NA),
    a.1 = c(NA, 1, NA),
    a.2 = c(NA, NA, 1),
    b = c(2, NA, NA),
    b.1 = c(NA, 2, NA),
    b.2 = c(NA, NA, 2),
    c = c(3, NA, NA),
    c.1 = c(NA, 3, NA),
    c.2 = c(NA, NA, 3)
)

And I could do this:

new_df = df |>
    dplyr::mutate(
        a = dplyr::coalesce(a, a.1, a.2),
        b = dplyr::coalesce(b, b.1, b.2),
        c = dplyr::coalesce(c, c.1, c.2)
    ) |>
    dplyr::select(a, b, c)

Which would give me:

new_df
  a b c
1 1 2 3
2 1 2 3
3 1 2 3

First, how could I efficiently do this without having to write coalesce n times? This example here is just an example and I'd really need to do this forty times with the dataset.

Also, is there a way to do it as I have here where I basically just keep a, b, and c rather than naming it as a.1 or whatever?


Solution

  • If columns are like something and somthing.etc shape,

    you may try

    library(dplyr)
    library(stringr)
    df %>%
      split.default(str_remove(names(.), "\\..*")) %>%
      map_df(~ coalesce(!!! .x))
    
          a     b     c
      <dbl> <dbl> <dbl>
    1     1     2     3
    2     1     2     3
    3     1     2     3