I want to generate variations on a name by edit distance equal or smaller than a number. I thought the simplest solution was a recursion. If I add the results of the current step after the recursion, the following code works fine; if I add them before, the recursion fails to terminate:
def generate_distance1(name):
res = []
# Deletion.
for i in range(len(name)):
if 0 == i:
res.append(name[1:])
elif len(name) - 1 == i:
res.append(name[:-1])
else:
res.append(name[:i] + name[i+1:])
# Substitution.
for i in range(len(name)):
if 0 == i:
res.append("?" + name[1:])
elif len(name) - 1 == i:
res.append(name[:-1] + "?")
else:
res.append(name[:i] + "?" + name[i+1:])
# Addition
for i in range(len(name) + 1):
if 0 == i:
res.append("?" + name)
elif len(name) == i:
res.append(name + "?")
else:
res.append(name[:i] + "?" + name[i:])
res = list(set(res))
return res
def generate_distance(name, max_distance, before):
if 0 == max_distance:
return [name]
dist1 = generate_distance1(name)
if 1 == max_distance:
return dist1
if before:
# This is not OK.
res = dist1
else:
# This is OK.
res = []
for n in dist1:
res.extend(generate_distance(n, max_distance - 1, before))
if not before:
res.extend(dist1)
return list(set(res))
print(generate_distance("abracadabra", 2, before=False))
print(generate_distance("abracadabra", 2, before=True))
I'm using Python 3.11.6. I think this issue has to do with lexical scoping, and possibly closures, but I cannot understand why; neither can I really formulate a better title for the question. Why does this recursive function fail to terminate?
The problem with the before version is that the for
loop will extend the list that is being iterated:
for n in dist1:
res.extend(generate_distance(n, max_distance - 1, before))
As you had made res
identical to dist1
, the above code really is doing this:
for n in dist1:
dist1.extend(generate_distance(n, max_distance - 1, before))
The loop should not have to iterate over more items than dist1
originally head, and as generate_distance
will always produce a non-empty list, this loop never ends (until a memory limit is hit).
This fix of the before approach is to create a new list for res
(like you also create a new list in the after approach):
res = dist1[:]
Unrelated to your question, but in generate_distance1
you don't need to have if ... elif ... else
blocks. You can apply the generic slicing (in the else
case) also to the two first cases. This is true for the deletion, the update and insert cases.