Suppose there are two data frames likes the following (given from this post):
df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))
df1
# CustomerId Product
# 1 Toaster
# 2 Toaster
# 3 Toaster
# 4 Radio
# 5 Radio
# 6 Radio
df2
# CustomerId State
# 2 Alabama
# 4 Alabama
# 6 Ohio
The question is how can I do the following sql query in R:
SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId
What I have known is that I can do the inner join using merge(df1, df2, by = "CustomerId")
. But it is not satisfied the condition of the join.
As I found in comments by dear Grothendieck, one straightforward solution is using sqldf package and get exactly my result in sql format:
library(sqldf)
sqldf("SELECT * FROM df1 JOIN df2 on df1.CustomerId <= df2.CustomerId")