Search code examples
rfunctionloopscoalesce

Coalesce multiple pairs of columns by name


I have dataframe of measurements from cancer-treatments that i want to 'merge' pairwise using coalesce(). But the frame contains 100+ columns so i want to use some kind of function or loop.

Here's what my dataframe looks like

I have a vector with the names of the treatments:

drugs <- c("A", "C", "B", "D")

The columns represent multiple measurements from a drug treatment. I want to merge the columns pairwise: A_1 + A_2 into A (new column), B_1, etc

This code works:

df <- df %>% mutate(A = coalesce(A_1, A_2).

But the frame has 100+ columns so I want to use some kind of function or loop, using the value from the vector with drugnames. From each drug there are 2 columns but they are not in the correct order, so I cannot use numbering, I have to use the name of the column. But when I put that into a function it doesn't work.

One addition: I would like to have the resulting columns (A,B,C) etc added to the frame.


Solution

  • Updated for the suffixes _ViMIC and _EtMIC:

    library(tidyverse)
    
    old_df <- tribble(
      ~A_EtMIC, ~B_EtMIC, ~C_EtMIC, ~A_ViMIC, ~C_ViMIC, ~B_ViMIC,
      1, NA, 2, 1, 0, 6,
      5, 5, NA, 2, 1, NA,
      NA, 3, NA, 3, 3, 7,
      1, 6, 3, 4, 2, NA,
      10, 8, 5, 5, NA, NA
    ) 
    
    new_df <- old_df |> 
      mutate(row = row_number()) |> 
      pivot_longer(-row, names_to = c("prefix", "suffix"), 
                   names_pattern = "(.*)_(..MIC)") |> 
      pivot_wider(names_from = suffix, 
                  values_from = value) |> 
      mutate(coalesced = coalesce(EtMIC, ViMIC)) |> 
      select(- ends_with("MIC")) |> 
      pivot_wider(names_from = prefix, values_from = coalesced,
                  names_glue = "{prefix}_MIC")
    
    both_df <- bind_cols(new_df, old_df)
    
    both_df
    #> # A tibble: 5 × 10
    #>     row A_MIC B_MIC C_MIC A_EtMIC B_EtMIC C_EtMIC A_ViMIC C_ViMIC B_ViMIC
    #>   <int> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
    #> 1     1     1     6     2       1      NA       2       1       0       6
    #> 2     2     5     5     1       5       5      NA       2       1      NA
    #> 3     3     3     3     3      NA       3      NA       3       3       7
    #> 4     4     1     6     3       1       6       3       4       2      NA
    #> 5     5    10     8     5      10       8       5       5      NA      NA
    

    Created on 2022-06-05 by the reprex package (v2.0.1)