I have two dataframes:
df1 <- data.frame(row1 = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
row2 = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
row3 = c(1, 5, 7, 6, 8, 9, 3, 2, 4))
row1 row2 row3
1 A aa 1
2 A bb 5
3 A cc 7
4 B aa 6
5 B bb 8
6 B cc 9
7 C aa 3
8 C bb 2
9 C cc 4
and
df2 <- data.frame(row1 = c("A", "A", "B", "B", "B", "C", "C"),
row2 = c("aa", "bb", "aa", "bb", "cc", "aa", "cc"))
row1 row2
1 A aa
2 A bb
3 B aa
4 B bb
5 B cc
6 C aa
7 C cc
I would like to create row 3 for df2 based on rows 1 and 2. The desired output looks like this:
row1 row2 row3
1 A aa 1
2 A bb 5
3 B aa 6
4 B bb 8
5 B cc 9
6 C aa 3
7 C cc 4
In my real dataframe the df2 has many more columns (but row3 should be added based on two specific columns). I also tried it with left join (unfortunately didn't save my approach) but this also didn't work with my implementation.
I found many solutions for "lookups" but all of these compare only one column and i couldn't figure out how two compare multiple columns. There is probably an easy solution, but i'm stuck. Can anyone help me with this?
Indeed, dplyr::left_join()
is what you need here.
Also, you're using names like row1
for things that are actually columns. It might be leading to some confusion.
library(dplyr)
df1 <- data.frame(row1 = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
row2 = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
row3 = c(1, 5, 7, 6, 8, 9, 3, 2, 4))
df2 <- data.frame(row1 = c("A", "A", "B", "B", "B", "C", "C"),
row2 = c("aa", "bb", "aa", "bb", "cc", "aa", "cc"))
left_join(df2, df1)
#> Joining, by = c("row1", "row2")
#> row1 row2 row3
#> 1 A aa 1
#> 2 A bb 5
#> 3 B aa 6
#> 4 B bb 8
#> 5 B cc 9
#> 6 C aa 3
#> 7 C cc 4
Created on 2023-02-01 with reprex v2.0.2