Search code examples
pythonpython-3.xalgorithmpython-itertools

Find all the words in the list that differ by a single letter


For any given word w in a list words, I want to find all the other words in the list that can become w by changing a single letter in them. All words are of equal length, and only substitution is allowed. Call this function parent(w).

For example, given words = ["hot","dot","dog","lot","log","cog"], parent("cog") would be ["dog", "log"]. parent("lot") would be ["dot", "hot", "log"] etc.

To do this, I first build a reverse index where the keys (str, int) map to the words that have character str at index int. Then, finding the parents of a word becomes the task of intersecting all the words that have the same letters as the word in the same positions, except for one.

The code is as follows, which produces an empty set. Why is it not working?

from typing import Iterator, Dict, Tuple, Set
import itertools

graph: Dict[Tuple[str, int], Set[int]] = dict()

for i, word in enumerate(words):
    for j, ch in enumerate(word):
        if (ch, j) not in graph:
            graph[(ch, j)] = set()

        graph[(ch, j)].add(i)

def parents(word: str) -> Iterator[int]:
    n: int = len(word)
    s: Set[int] = set()
    for part in itertools.combinations(range(n), n - 1):
        keys = map(lambda x: (word[x], x), part)
        existing_keys = filter(lambda k: k in graph, keys)
        for y in itertools.chain(map(lambda k: graph[k], existing_keys)):
            s = s.intersection(set(y)) if s else set(y)

    return filter(lambda i: words[i] != word, s)

print(list(parents("cog"))) # empty!!!

Solution

  • A very simple solution. A different approach.

    Complexity: O(N * 26) => O(N) - where N is the number of characters in each word.

    def main(words, word):
        words = set(words)
        res = []
        for i, _ in enumerate(word):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                w = word[:i] + c + word[i+1:]
                if w != word and w in words:
                    res.append(w)
        return res
    
    
    print(main(["hot","dot","dog","lot","log","cog"], "cog"))
    # ['dog', 'log']
    

    Instead of iterating over all the alphabets, you can also choose to only iterate on the alphabets that are occurring in the list using:

    {letter for w in words for letter in w}