r, text, merge

R - Merging two data frames by text


I have two datasets that I want to merge:

df1 <- data.frame(title = c("residence mozart",
                            "les hesperides auteuil mirabeau",
                            "chaillot",
                            "jouvenet",
                            "retraite  dosne"))

df2 <- data.frame(title = c("terrasses mozart", "chaillot",
                            "villa jules janin", "retraites dosne"))

I would like to get something like this:

1 residence mozart                  NA (or terrasses mozart)
2 les hesperides auteuil mirabeau   NA
3 chaillot                          chaillot
4 jouvenet                          NA
5 retraite  dosne                   retraites dosne


Here is what I tried:

x = data.frame(title_df2 = matrix(ncol = 1, nrow = nrow(df1)))


for (i in nbr){
  x[i, ] <- grep(df1$title[i], df2$title, value = T)
}

It does not work at all, even though grep(df1$title[3], df2$title, value = T) works and returns "chaillot".
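
Two things in the loop above look like plausible causes of the failure: `nbr` is never defined in the snippet, and `grep()` returns a zero-length vector when nothing matches, which cannot be assigned into a row of `x`. The following is a minimal sketch that guards against both. It assumes literal substring matching is what was intended (hence `fixed = TRUE`) and keeps only the first hit; neither choice comes from the original post.

```r
df1 <- data.frame(title = c("residence mozart",
                            "les hesperides auteuil mirabeau",
                            "chaillot",
                            "jouvenet",
                            "retraite  dosne"))
df2 <- data.frame(title = c("terrasses mozart", "chaillot",
                            "villa jules janin", "retraites dosne"))

# Start with NA everywhere so unmatched rows need no special handling
x <- data.frame(title_df2 = rep(NA_character_, nrow(df1)))

for (i in seq_len(nrow(df1))) {
  # fixed = TRUE treats the title as a literal string, not a regular expression
  hit <- grep(df1$title[i], df2$title, value = TRUE, fixed = TRUE)
  # grep() returns character(0) when nothing matches; leave NA in that case
  if (length(hit) > 0) x$title_df2[i] <- hit[1]
}

x
```

With exact substring matching only "chaillot" finds a partner; "retraite  dosne" and "retraites dosne" differ by one character, so pairing them needs approximate matching.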


Solution

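  • If I understand correctly, you want an approximate (fuzzy) match on title. One way is stringdist_left_join() from the fuzzyjoin package with the Jaro-Winkler distance: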

    df1 <- data.frame( title = 
                         c("residence mozart", 
                           "les hesperides auteuil mirabeau",
                           "chaillot",
                           "jouvenet",
                           "retraite  dosne"))
    
    
    df2 <- data.frame(title = c("terrasses mozart", "chaillot",
                                "villa jules janin", "retraites dosne"))
    library(dplyr)
    library(fuzzyjoin)
    
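    # Jaro-Winkler distances lie in [0, 1], so the default max_dist = 2 keeps every pair.
    # Keep only the close pairs, then join back to df1 to restore unmatched titles as NA.
    stringdist_left_join(x = df1, y = df2, method = "jw", distance_col = "d") %>% 
      filter(d < 0.25) %>% 
      right_join(df1, by = c("title.x" = "title"))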
    #> Joining by: "title"
    #>                           title.x          title.y          d
    #> 1                residence mozart terrasses mozart 0.23863636
    #> 2                        chaillot         chaillot 0.00000000
    #> 3                 retraite  dosne  retraites dosne 0.09206349
    #> 4 les hesperides auteuil mirabeau             <NA>         NA
    #> 5                        jouvenet             <NA>         NA
    

    Created on 2021-04-19 by the reprex package (v2.0.0)
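
If installing fuzzyjoin is not an option, a similar result can be sketched in base R with `adist()` from the utils package. This is not the method used above: `adist()` computes a generalized Levenshtein (edit) distance rather than Jaro-Winkler, and the cutoff `max_dist <- 2` is a hypothetical value chosen for this example, not one taken from the answer.

```r
df1 <- data.frame(title = c("residence mozart",
                            "les hesperides auteuil mirabeau",
                            "chaillot",
                            "jouvenet",
                            "retraite  dosne"))
df2 <- data.frame(title = c("terrasses mozart", "chaillot",
                            "villa jules janin", "retraites dosne"))

# Edit-distance matrix: one row per df1 title, one column per df2 title
d <- adist(df1$title, df2$title)

# For each df1 title, the column index of the closest df2 title and its distance
best <- apply(d, 1, which.min)
best_dist <- d[cbind(seq_len(nrow(d)), best)]

# Hypothetical cutoff: accept a match only if at most 2 edits are needed
max_dist <- 2
df1$title_df2 <- ifelse(best_dist <= max_dist,
                        as.character(df2$title)[best],
                        NA_character_)

df1
```

With this cutoff, "chaillot" and "retraite  dosne" find their partners, while "residence mozart" stays unmatched because turning it into "terrasses mozart" takes more than two edits. Raising `max_dist` trades missed matches for false ones, much like the 0.25 threshold in the Jaro-Winkler version.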