Search code examples
c++stringmatchprefix

Efficient structure to get all prefixes of a string that match a previously defined set of strings


Suppose that I have a set of prefixes such as ["a", "ab", "ba", "aba", "bbb"].

I also have a list of words. For each word in this list, for example "abad", I want to obtain efficiently all the prefixes of this word that match a word in the set of prefixes. In this example I would like as output ["a","ab", "aba"].

I want to only use structures from the std.

I am thinking about some kind of tree structure, but I can't think of a way of implementing it relatively effortlessly.


Solution

  • A simple implementation would be a hash table using std::unordered_set, and then compare each prefix of each word of the list with the set.

    #include <iostream>
    #include <string>
    #include <unordered_set>
    #include <vector>
    
    int main ()
    {
        std::unordered_set<std::string> myPrefixes = {"a", "ab", "ba", "aba", "bbb"};
    
        std::vector<std::string> listOfWords;
    
        for (int i = 0; i < listOfWords.size(); i++) {
            std::vector<std::string> result;
            std::string word = listOfWords[i];
            for (int j = 0; j < word.length() - 1; j++) {
                if (myPrefixes.count(word.substr(0, word.length - j)) > 0) {
                    result.push_back(word.substr(0, word.length - j);
                }
            }
            // print the result vector or do whatever with it
        }
        return 0;
    }