Search code examples
c++data-structureshashstring-hashing

Hash map optimised for lookup


I am looking for some map which has fixed keys (fixed during initialization) and that does faster look-up. It may not support adding/updating elements later. Is there some algorithm which looks the list of keys and formulates a function so that it is faster to look-up later. In my case, keys are strings.

Update:

Keys are not known at compile time. But during initialization time of the application. There wont be any further insertions later but there will be lots of look-ups. So I want look-ups to be optimized.


Solution

  • CMPH may be what you're looking for. Basically this is gperf without requiring the set at compile-time.

    Though of course std::unordered_map as by C++11 might just do too, though possibly with a few collisions.

    Since you lookup strings, for strings, a trie (any of the different trie flavours, crit-bit or whatever funky names they have) may also be worthwhile to look into, especially if you have many of them. There are a lot of free trie implementations freely available.
    The advantage of tries is that they can index-compress strings, so they use less memory, which has a higher likelihood of having data in cache. Also the access pattern is less random, which is also cache-friendly. A hash table must store the value plus the hash, and indexes more or less randomly (not randomly, but unpredictably) into memory. A trie/trie-like structure ideally only needs one extra bit that distinguishes a key from its common prefix in each node.

    (Note by the way that O(log(N)) may quite possibly be faster than O(1) in such a case, because big-O does not consider things like that.)