Search code examples
pythonalgorithmpalindrome

Finding closest palindrome to string


I am trying to write a python program to find the closest palindrome to a word. I can add a letter to any part of the string, remove a letter from any part of the string, or change a letter in any part of a string. I have been looking at using the levenshtein distance to find the minimum number of edits needed between two words in the post Edit Distance in Python. But I am not sure how to programmatically find the palindrome that requires the smallest number of edits. Some examples for what I am trying to achieve:

palindrome('hello') = 'ollo'
#you can remove the h and then turn the e into an o, giving a palindrome in 2 steps
levenshtein('hello',palindrome('hello')) = 2

palindrome('test') = 'tet'
#you can remove the s to get a palindrome
levenshtein('test',palindrome('test')) = 1

palindrome('tart') = 'trart'
#adding an r or removing the r produces a palindrome, but both solutions only require 1 edit so either would be acceptable.
levenshtein('tart',palindrome('tart')) = 1

I was able to use the levenshtein code from the linked post to find the distance between two strings. I need help writing a palindrome() function that takes a string and returns the closest palindrome to that string.


Solution

  • Well here's my DFS implementation. It only searches up to a distance of word length-2, because past that it becomes kind of trivial (drop all but one letter, change every letter to the same).

    It finds all palindromes up to that distance limit, and sorts them by distance.

    import time
    word = "hello"
    visited_cost = {}       # to keep track of non-palindromes we've considered
    palindrome_cost = {}    # to store actual palindromes we've found
    
    
    def find_palindrome(w, dist=0 ):
        # Don't go on forever
        if len(w) == 0 or len(w) > len(word) + 2 or dist > len(word) - 2:
            return []
    
        # Don't retry potential palindromes that we've tried before
        global visited_cost
        if w in visited_cost:
            if dist >= visited_cost[w]:
                return []
        visited_cost[w] = dist
    
        # check if we've found a palindrome
        if (reverse(w)) == w:
            if w in palindrome_cost:
                palindrome_cost[w] = min(palindrome_cost[w], dist)
            else:
                palindrome_cost[w] = dist
            return [w]
    
        palindromes = []
        if len(w) > 1:
            for x in drop_one(w):
                palindromes += find_palindrome(x, dist+1)
            for x in add_one(w):
                palindromes += find_palindrome(x, dist+1)
            for x in change_one(w):
                palindromes += find_palindrome(x, dist+1)
            return palindromes
    
    
    # For a word w, gives all possible words obtained by dropping one letter
    def drop_one(w):
        return [w[:i]+w[i+1:] for i in range(len(w))]
    
    
    # For a word w, gives all possible words obtained by inserting a capital X
    # at any position in the word. Of course "X" could be any letter
    def add_one(w):
        return [w[:i]+"X"+ w[i:] for i in range(len(w))]
    
    
    # For a word w, gives all possible words obtained by changing one letter 
    # to another letter occurring in the word
    def change_one(w):
        return [w[:i] +j +  w[i + 1:] for i in range(len(w)) for j in w]
    
    
    def reverse(s):
        return "".join(reversed(s))
    
    t0 = time.time()
    results = set(find_palindrome(word))
    s = sorted(results, key = lambda x: palindrome_cost[x])
    
    for x in s:
        print x, palindrome_cost[x]
    print "Found %i palindromes based on '%s' in %.3f seconds" %\
           (len(results), word, time.time() - t0)
    

    Output:

    ollo 2
    olllo 2
    elle 2
    heleh 2
    hllh 2
    oeleo 2
    hlllh 2
    [...]
    Found 46 palindromes based on 'hello' in 0.065 seconds