Search code examples
algorithmtime-complexitycode-complexity

Word Ladder runtime complexity


I am trying to solve the following problem:

Given a beginWord, say pig, and an endWord, say phi, and a list of words wordList, say [phg, phi]. We are allowed to change one character of beginword each time, and the transformed word needs to be in the wordList. I need an algorithm to find the minimal steps to achieve the transformation. In my example we can do pig -> phg -> phi. Also all the words involved are of the same length.

My solutions is to use BFS:

queue;
level = 0;
queue.add(beginWord);
while(queue is not empty){
    for(i looping from 0 to the size of the queue){
        curr = the first element in the queue;
        if curr is endWord, then returns level + 1;
        for(j starting from 0 to the length of curr){
            for(c starting from a to z){
                replaced = replace the jth character of curr with c;
                if we have never seen replaced before and it is contained in wordList, then add it to the queue.
            }
        }
    }
}
return 0;

Now I am struggling to come up with the runtime complexity of this algorithm. Let us say that the length of the words is $M$, and the length of the wordList is N. My thinking is the following:

For each intermediate word, you have 26**M choices and in the worst case one might need to visit all N words in the wordList, so in total one get N * 26 ** M. However, since we are requiring that all the transformations to be in the wordList, so actually, at each intermediate word, you have at most N-1 choices. Thus, it seems to me that the runtime complexity should be O(N(N - 1)).

Is that correct? I do not think so, because even though at each intermediate word, there is only N-1 possible transformations, one do spend operations to check whether which N-1 ones of them are valid by looping through the whole word and replaced each of its characters by 26 possibilities.

I know that the problem has been asked here. But no answer is accepted there and I do not agree with their analysis.


Solution

  • So as far as the complexity of the algorithm is concerned:

    queue;
    level = 0;
    queue.add(beginWord);
    while(queue is not empty){
        for(i looping from 0 to the size of the queue){
            curr = the first element in the queue;
            if curr is endWord, then returns level + 1;
            for(j starting from 0 to the length of curr){
                for(c starting from a to z){
                    replaced = replace the jth character of curr with c;
                    if we have never seen replaced before and it is contained in wordList, then add it to the queue.
                }
            }
        }
    }
    return 0;
    

    Assuming the innermost loop uses a trie or hashtable for lookups (won't get more efficient than that), the inner condition takes O(n):

    while(queue is not empty){
        for(i looping from 0 to the size of the queue){
            curr = the first element in the queue;
            if curr is endWord, then returns level + 1;
            for(j starting from 0 to the length of curr){
                for(c starting from a to z){
                    O(n)
                }
            }
        }
    }
    

    The next inner most loop iterates 26 times, so still O(n):

    while(queue is not empty){
        for(i looping from 0 to the size of the queue){
            curr = the first element in the queue;
            if curr is endWord, then returns level + 1;
            for(j starting from 0 to the length of curr){
                O(n)
            }
        }
    }
    

    The next loop iterates over the length of the string, so again n steps. The additional steps are either constant-time or linear (O(n)), so we can just ignore them:

    while(queue is not empty){
        for(i looping from 0 to the size of the queue){
            O(n ^ 2)
        }
    }
    

    This for-loop is now fairly useless. We can simply remove it at instead poll the queue directly in the while-loop. The while-loop takes O(m) steps where m is the number of words in the word-list. So in total the time-complexity is O(n^2 * m).