Search code examples
c++stringdata-structuresmultimap

I need a slightly different multimap


I'm looking for a C++ container class, that's a lot like a multimap, but slightly different. The container would store pairs of strings. But when I retrieve items from the container using key K, I want to find all the items where K begins with the item's own key.

E.G. If I use key "abcde" I want to find items with key "adc" and "abcde", but not "abcqz".

Or in pseudo C++ form:

multimap2<string, string>  myMultiMap;
myMultiMap.insert( pair("abcde", "hello"));
myMultiMap.insert( pair("abc",   "Hi"));
myMultiMap.insert( pair("abcqz", "goodbye"));

// prints 2
cout << myMultiMap.count("abcde") << endl;

// prints "hello"  and  "Hi"
cout << myMultiMap.everything_which_matches("abcde") << endl;

// prints "Hi"
cout << myMultiMap.everything_which_matches("abc") << endl;

// prints "goodbye"
cout << myMultiMap.everything_which_matches("abcqz") << endl;

Insertion time is unimportant, but I need fast access to the items. Is it possible to do this with a normal Multimap by creating a special < operator? My hunch is that I would need the normal < operator for insertion, and a special one for retrieval.

thanks

Hugo


Solution

  • I would suggest using a trie.

    Basically you have a tree with 1 node per unique character. Your algorithm would be O(m) for both lookups and insertion, where m is the length of a string.

    So following your example with:

    "abcde", "hello" 
     "abc",  "Hi"
    "abcqz", "goodbye"
    

    Then you would have the following trie:

           a
           |
           b
           |
           c       (c holds data of hi)
         /  \
        d    q
        |    |
        e    z (z holds data of goodbye)    (e holds data of hello)
    

    To do a lookup you simply start at the root node (root node not shown above), and follow the next char in your input string. Each time you reach a node that has a data result, you will include that as one of your output strings.

    So a search for abcde would give you: "hi", "hello" as you wanted. It would not give you "goodbye" because you did not traverse over that result node.