stringdist_join results in NAs

I am experimenting with stringdist_join (from the fuzzyjoin package, which builds on stringdist) to make fuzzy joins, and I have run into a problem that I do not understand and cannot find an answer for. I want to join these two data tables with the "dl" method, but the result contains an NA where I expected a match. Maybe one of you has an explanation for this. The code:

x <- stringdist_join(test1, test2, by = "label", mode = "left", distance_col="distance", method="dl") 

If I use the Jaccard method, however, there is a match:

y <- stringdist_join(test1, test2, by = "label", mode = "left", distance_col="distance", method="jaccard", q=4) 

I hope someone can clarify.

Cheers, Dome


  • max_dist is set to 2 by default.

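    The dl distance between "techniker" and "techni" is 3, which is more than 2, so there is no match and the left join fills label.y with NA.

    A Jaccard distance never exceeds 1, so it always falls within the default max_dist, which is why that method returns a match.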

    stringdist_join(test1, test2, by = "label", mode = "left", distance_col="distance", method="dl", max_dist=5)
    #     label.x label.y distance
    # 1 techniker  techni        3
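
To see where these numbers come from, the two distances can be computed directly with stringdist(), the function that stringdist_join relies on. This is only a minimal sketch: the original test1 and test2 tables were not posted, so the two labels are taken from the output above and everything else is an assumption.

```r
library(stringdist)

# Labels copied from the join output above; the real test1/test2
# tables were not shown, so these two strings are an assumption.
a <- "techniker"
b <- "techni"

# Damerau-Levenshtein distance: counts single-character edits
# (here, deleting "k", "e" and "r" turns a into b).
d_dl <- stringdist(a, b, method = "dl")

# Jaccard distance on 4-grams: 1 minus the share of 4-grams the two
# strings have in common, so it always lies between 0 and 1.
d_jac <- stringdist(a, b, method = "jaccard", q = 4)

# Compare both against the default max_dist of stringdist_join.
max_dist <- 2
print(c(dl = d_dl, jaccard = d_jac))
print(c(dl_matches = d_dl <= max_dist, jaccard_matches = d_jac <= max_dist))
```

Because the two methods live on different scales, max_dist has to be chosen per method: a value around 2 to 5 is meaningful for "dl", while for "jaccard" a threshold below 1 is needed for the filter to exclude anything at all.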