1) Why do we add 1 on these line?
d[i-1, j] + 1, // deletion
d[i, j-1] + 1, // insertion
The line
if s[i] = t[j] then cost := 0
else cost := 1
should take into account deleted/lower word lengths, or am I missing something?
2) Also, the comments state deletion and insertion. Am I right in thinking that it's checking for deleted characters in both words (the integers j/i representing the length of words), because a lower value will represent deleted characters.
The code used is here (because it is pseudo code and I have no language specific issues, this thread is not in any language category):
http://www.iterasi.net/openviewer.aspx?sqrlitid=z0cloj7xhk-ce0f72v4cjq
1) These lines compute the distance in the case of deletion, in the case of insertion, and the one using "cost" in case of a substitution...
deletion and insertion effectively count as "1" in the distance calculation, hence the +1.
We can believe there was a substitution only if the characters are different hence the "cost=0" if both chars are equal...
The new distance is then the minimum distance between these 3 hypothesis so you don't always add 1 ...
2) if I compute the distance between "FooBar" and "FoBaWhatever" I have some character deletions even if the second string is longer than the first one ...
Of course if the second string is shorter than the second ( FooBar -> FoBa ) I will find some deletions but cannot know in advance where they are...