Search code examples
sqlrmergeinner-join

Conditional JOIN on two data frames in R


Suppose there are two data frames likes the following (given from this post):

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

df1
#  CustomerId Product
#           1 Toaster
#           2 Toaster
#           3 Toaster
#           4   Radio
#           5   Radio
#           6   Radio

df2
#  CustomerId   State
#           2 Alabama
#           4 Alabama
#           6    Ohio

The question is how can I do the following sql query in R:

SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId

What I have known is that I can do the inner join using merge(df1, df2, by = "CustomerId"). But it is not satisfied the condition of the join.


Solution

  • As I found in comments by dear Grothendieck, one straightforward solution is using sqldf package and get exactly my result in sql format:

    library(sqldf)
    sqldf("SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId")