Search code examples
pythonrecursionlexical-scope

Recursive function fails depending on lexical scoping


I want to generate variations on a name by edit distance equal or smaller than a number. I thought the simplest solution was a recursion. If I add the results of the current step after the recursion, the following code works fine; if I add them before, the recursion fails to terminate:

def generate_distance1(name):

    res = []
    
    # Deletion.
    for i in range(len(name)):
        if 0 == i:
            res.append(name[1:])
        elif len(name) - 1 == i:
            res.append(name[:-1])
        else:
            res.append(name[:i] + name[i+1:])

    # Substitution.
    for i in range(len(name)):
        if 0 == i:
            res.append("?" + name[1:])
        elif len(name) - 1 == i:
            res.append(name[:-1] + "?")
        else:
            res.append(name[:i] + "?" + name[i+1:])

    # Addition
    for i in range(len(name) + 1):
        if 0 == i:
            res.append("?" + name)
        elif len(name) == i:
            res.append(name + "?")
        else:
            res.append(name[:i] + "?" + name[i:])

    res = list(set(res))
    
    return res


def generate_distance(name, max_distance, before):
    if 0 == max_distance:
        return [name]

    dist1 = generate_distance1(name)
    if 1 == max_distance:
        return dist1

    if before:
        # This is not OK.
        res = dist1
    else:
        # This is OK.
        res = []

    for n in dist1:
        res.extend(generate_distance(n, max_distance - 1, before))

    if not before:
        res.extend(dist1)
        
    return list(set(res))

print(generate_distance("abracadabra", 2, before=False))

print(generate_distance("abracadabra", 2, before=True))

I'm using Python 3.11.6. I think this issue has to do with lexical scoping, and possibly closures, but I cannot understand why; neither can I really formulate a better title for the question. Why does this recursive function fail to terminate?


Solution

  • The problem with the before version is that the for loop will extend the list that is being iterated:

        for n in dist1:
            res.extend(generate_distance(n, max_distance - 1, before))
    

    As you had made res identical to dist1, the above code really is doing this:

        for n in dist1:
            dist1.extend(generate_distance(n, max_distance - 1, before))
    

    The loop should not have to iterate over more items than dist1 originally head, and as generate_distance will always produce a non-empty list, this loop never ends (until a memory limit is hit).

    This fix of the before approach is to create a new list for res (like you also create a new list in the after approach):

    res = dist1[:]
    

    Unrelated to your question, but in generate_distance1 you don't need to have if ... elif ... else blocks. You can apply the generic slicing (in the else case) also to the two first cases. This is true for the deletion, the update and insert cases.