r dplyr tidyr across

Replacing multiple columns from different dataframe using dplyr


I have two dataframes, one of which contains a subset of IDs and columns of the other (but has different values).

ds1 <- data.frame(id = c(1:4),
                      d1 = "A",
                      d2 = "B",
                      d3 = "C")


ds2 <- data.frame(id = c(1,2),
                     d1 = "W",
                     d2 = "X")

I am hoping to use dplyr on ds1 to find the columns it shares with ds2 and replace their values with those found in ds2, matching on id. I can mutate them one at a time like this:

ds1 %>% 
  # match() aligns each ds1 id with its row in ds2 (NA when there is no match)
  mutate(d1 = ifelse(id %in% ds2$id, ds2$d1[match(id, ds2$id)], d1),
         d2 = ifelse(id %in% ds2$id, ds2$d2[match(id, ds2$id)], d2))

In my real data, however, I need to do this for 47 columns. Given the flexibility of across(), I feel there must be a better way. I am open to non-dplyr solutions as well.


Solution

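  • using rows_update() from dplyr, which overwrites the columns of ds1 that also exist in ds2 for the rows whose id matches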

    library(tidyverse)
    ds1 <- data.frame(id = c(1:4),
                      d1 = "A",
                      d2 = "B",
                      d3 = "C")
    
    
    ds2 <- data.frame(id = c(1,2),
                      d1 = "W",
                      d2 = "X")
    
    rows_update(x = ds1, y = ds2, by = "id")
    #>   id d1 d2 d3
    #> 1  1  W  X  C
    #> 2  2  W  X  C
    #> 3  3  A  B  C
    #> 4  4  A  B  C
    

    Created on 2021-05-11 by the reprex package (v2.0.0)
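
Since the question asks about across() and is open to non-dplyr solutions, here is a minimal sketch of both alternatives. It is an illustration under a few assumptions, not part of the original answer: the key column is called id, ids are unique in ds2, every id in ds2 also occurs in ds1, and the helper names shared, idx, rows, result_across and result_base are made up for this example.

```r
library(dplyr)

ds1 <- data.frame(id = 1:4, d1 = "A", d2 = "B", d3 = "C",
                  stringsAsFactors = FALSE)
ds2 <- data.frame(id = c(1, 2), d1 = "W", d2 = "X",
                  stringsAsFactors = FALSE)

# Columns present in both data frames, excluding the key
shared <- setdiff(intersect(names(ds1), names(ds2)), "id")

# --- across() variant ---
# Row of ds2 that corresponds to each ds1 id (NA where ds2 has no such id)
idx <- match(ds1$id, ds2$id)

result_across <- ds1 %>%
  mutate(across(all_of(shared),
                # keep the ds1 value when there is no match,
                # otherwise take the value from the same-named ds2 column
                ~ if_else(is.na(idx), .x, ds2[[cur_column()]][idx])))

# --- base R variant ---
# Rows of ds1 to overwrite, in the order of ds2
# (assumes every ds2 id exists in ds1, so there are no NAs here)
rows <- match(ds2$id, ds1$id)

result_base <- ds1
result_base[rows, shared] <- ds2[shared]

result_across
result_base
```

On the example data both variants should give the same table as rows_update() above. Because shared is computed from the column names, the same code covers 47 columns without listing them by hand.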