Search code examples
algorithmtime-complexitytrie

Word Search with time complexity of O(m) using Trie - m is size of word


I have been attempting an algorithm which runs in O(w) time, where w is the length of a word I am attempting to find in a list of alphabetically ordered words. Space is not a concern. I have found some information on using a Trie to find the word in O(w) time, but I am not sure if this time includes the amount of time required to build the Trie itself? Say I have an array of alphabetically sorted words, S, and I want to find a word w, S has n words, w is of length m. Here is what I have so far:

1. build Trie, T, from S // O(?) time
2. search for w in T // O(m) time

I would like to find a way to keep step 2 in constant time so my total time complexity will be O(m). Is there a way to do this? If so, I only need some guidance on how to set this up. If not, is there another data structure I am forgeting about? Space consumption is not an issue. I can use as much space as needed to get the algorithm to run in O(w), which I can't seem to do, unless I can setup a Trie in constant time.

I have found this post stating the time to create a Trie is O(n*l) where l is the avg length of words in S. This maybe tells me I need to use a different data structure for my solution, but I cannot determine which other data structure type would fit my problem.


Solution

  • Typically, one would create a Trie, or some other data structure like a hash mpap, only once and then re-use it every time you need to find a word. If you're allowed to do that, then you can more or less ignore the cost to create the Trie and concentrate on the time to find a word in that Trie which is, as you note, O(m).

    Note that if you're just "given" a array of alphabetically ordered words somebody, somewhere paid an O(n * m) price to read all of those words off of disk, out of a database, or something and put them into a list. If they had to sort the array they paid an additional cost. Note that you could read all the words off the disk (or out of the DB, or wherever they came from) and into a Trie in the same O(n*m) time so, in some sense, building the Trie is "free" as long as this particularly challenge allows you to build the tree instead of being forced to work with a sorted array.

    If the challenge is that you're given a sorted array of words and a word to find as input, and any time you spend modifying that array "counts", then I think you're out of luck. You can find a word in a sorted array in O(log(n) * w) but you can't do better than that.