string algorithm similarity levenshtein-distance

Measuring string similarity with transpositions


If I have three strings, string1 = Laptop, string2 = Latpop, and string3 = Lavmop, the Levenshtein distance algorithm returns the same distance for string1 and string2 as it does for string1 and string3. That is because the Levenshtein algorithm counts only three operations: insertion, deletion, and substitution. It does not include a transposition operation. For example, swapping the third and fourth characters of Latpop yields Laptop, but Levenshtein counts that as two substitutions.
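To make the problem concrete, here is a minimal Python sketch of the standard Levenshtein dynamic program. The function name `levenshtein` is only illustrative and does not come from any particular library. It gives the same distance for both pairs.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance using insertions, deletions, and substitutions only."""
    # prev[j] holds the distance between the prefix of a processed so far
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("Laptop", "Latpop"))  # 2
print(levenshtein("Laptop", "Lavmop"))  # 2
```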

It's obvious that Latpop is more similar to Laptop than Lavmop is, so it's not correct to put them at the same similarity level.

Is there an algorithm that takes the transposition operation into account?


Solution

  • I found the answer in the Damerau–Levenshtein distance and the Jaro–Winkler distance, both of which take transpositions into account.
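As a rough illustration, here is a minimal Python sketch of the restricted variant of Damerau–Levenshtein, usually called optimal string alignment. It extends the Levenshtein recurrence with one extra case for swapping two adjacent characters. The function name `osa_distance` is hypothetical, and this is not the unrestricted Damerau–Levenshtein algorithm, which needs extra bookkeeping.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: Levenshtein plus adjacent swaps."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Extra case: the last two characters are swapped.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("Laptop", "Latpop"))  # 1
print(osa_distance("Laptop", "Lavmop"))  # 2
```

With the transposition case, Latpop is now one edit away from Laptop while Lavmop stays at two, which matches the intuition in the question.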