Matching the same lines from 2 different files and 2 columns

I have two files, and I would like to merge them based on the first two columns, if the lines match, and add the third column of file2 to file1.

File1

CHR BP SNP ZSCORE
1   2534 rs123 0.5
1   2860 rs225 0.1
1   10000 rs356 0.8

File2

CHR BP AF
1   2534 0.02
1   6538 0.1
1   12345 0.1

What I want is:

File3

CHR BP SNP ZSCORE AF
1   2534 rs123 0.5 0.02

I tried R's merge(), but it led to just adding the columns together in the new dataset:

merged<-merge(File1, File2, by.x = "BP", by.y = "CHR", all = TRUE)
merged<-merge(File1, File2, by.x = "CHR", by.y = "BP", all = TRUE)

The next merge() I tried:

merged <- merge(File1[c('CHR','BP')],File2)

produced additional rows, which I don't think it's correct. How can matching a file of 17 million rows against the file of 17,200,000 rows produce 18,000,000 rows?!

Tried dplyr left join:

merged<-dplyr::left_join(File1, File2, by = c("CHR", "BP"))

Warning message:
In dplyr::left_join(File1, File2, by = c("CHR", "BP")) :
  Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 5 of `x` matches multiple rows in `y`.
ℹ Row 3570136 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship = "many-to-many"` to silence this warning.

Reversing the files led to the same error.

I also tried various versions of awk and join, but didn't produce what I wanted.

If anyone knows the solution, please let me know, doesn't matter whether it's an R solution, or awk/join-all.

Solution

Using the input in the Note at the end

1) merge This uses base R merge

merge(File1, File2, by = 1:2)
##   CHR   BP   SNP ZSCORE   AF
## 1   1 2534 rs123    0.5 0.02

2) inner_join This uses dplyr inner_join:

library(dplyr)
inner_join(File1, File2, join_by(CHR, BP))
##   CHR   BP   SNP ZSCORE   AF
## 1   1 2534 rs123    0.5 0.02

3 sqldf If the files are too large it can be done outside of R provided the result is still small enough for R using sqldf:

# create test files
write.csv(File1, "File1.csv", quote = FALSE, row.names = FALSE)
write.csv(File2, "File2.csv", quote = FALSE, row.names = FALSE)
rm(File1, File2)

sqldf("select * from File1 inner join File2 using (CHR, BP)", 
  dbname = tempfile())
##   CHR   BP   SNP ZSCORE   AF
## 1   1 2534 rs123    0.5 0.02

4) xsv Another way to do it outuside of R is to use the xsv command line utility. Assuming the same setup as for sqldf

library(data.table)
fread("xsv join CHR,BP File1.csv CHR,BP File2.csv")
##    CHR   BP   SNP ZSCORE CHR   BP   AF
## 1:   1 2534 rs123    0.5   1 2534 0.02

Note

Lines1 <- "
CHR BP SNP ZSCORE
1   2534 rs123 0.5
1   2860 rs225 0.1
1   10000 rs356 0.8e"

Lines2 <- "CHR BP AF
1   2534 0.02
1   6538 0.1
1   12345 0.1"

File1 <- read.table(text = Lines1, header = TRUE)
File2 <- read.table(text = Lines2, header = TRUE)