Search code examples
rdataframemergeinsertappend

Insert value from df2 into matching df1 cell if exists, append df2 rows if not exists in df1, and index if inserted or apended [R]


How can I replace values in df1 with var values in df2 if they match BOTH code1 and code2, and append non-matches to the bottom keeping the order of df1 AND indexing whether the line was edited:

Sample data:

#create sample df1
df1 <- data.frame(code1=c("A","B","C","E","E","A"),code2=c("C3","C1","C4","C3","C5","C3"),var1=c(1,2,3,4,5,6),var2=c("vA","vB","vC","vD","vE","vF"),ver=1,stringsAsFactors = F)

#create sample df2
df2 <- data.frame(code1=c("E","B","Z","Z"),code2=c("C3","C1","C3","C5"),var1=c(9,10,1,5),var2=c("vX","vY","vD","vD"),ver=2,stringsAsFactors = F)

I'd like a function that creates the following result df:


df_desired <- data.frame(code1=c("A","B","C","E","E","A","Z","Z"),code2=c("C3","C1","C4","C3","C5","C3","C3","C5"),var1=c(1,10,3,9,5,6,1,5),var2=c("vA","vY","vC","vX","vE","vF","vD","vD"),ver=c(1,2,1,2,1,1,2,2),stringsAsFactors = F)

Solution

  • We could also use the (experimental) rows_upsert from dplyr:

    library(dplyr)
    
    df1 |>
      rows_upsert(df2, by = c("code1", "code2"))
    

    Output:

      code1 code2 var1 var2 ver
    1     A    C3    1   vA   1
    2     B    C1   10   vY   2
    3     C    C4    3   vC   1
    4     E    C3    9   vX   2
    5     E    C5    5   vE   1
    6     A    C3    6   vF   1
    7     Z    C3    1   vD   2
    8     Z    C5    5   vD   2
    

    I guess ver is the indexing variable you need.