I'm playing around with Levenshteins Edit Distance algorithm, and I want to extend this to count transpositions -- that is, exchanges of adjacent letters -- as 1 edit. The unmodified algorithm counts insertions, deletes or substitutions needed to reach a certain string from another. For instance, the edit distance from "KITTEN" to "SITTING" is 3. Here's the explanation from Wikipedia:
Following the same method, the edit distance from "CHIAR" to "CHAIR" is 2:
I would like to count this as "1 edit", since I only exchange two adjacent letters. How would I go about to do this?
You need one more case in the algorithm from Wikipedia:
if s[i] = t[j] then
d[i, j] := d[i-1, j-1]
else if i > 0 and j > 0 and s[i] = t[j - 1] and s[i - 1] = t[j] then
d[i, j] := minimum
(
d[i-2, j-2] + 1 // transpose
d[i-1, j] + 1, // deletion
d[i, j-1] + 1, // insertion
d[i-1, j-1] + 1 // substitution
)
else
d[i, j] := minimum
(
d[i-1, j] + 1, // deletion
d[i, j-1] + 1, // insertion
d[i-1, j-1] + 1 // substitution
)