Search code examples
c++hashmaphashsetstdmapstdset

Implement unordered_map, unordered_set, map, set using vectors


This is more of an intellectual exercise, but are there battle-tested C++ libraries implementing hash map/set (std::unordered_map, std::unordered_set), red-back trees (std::map, std::set) using std::vectors?

Take std::unordered_set for instance. My idea is to use 3 vectors, X (stores heads of chains), Y (stores set elements), Z (temporary container).

Roughly speaking, given a set element 12345,

  1. Let i = X[hash(12345) % X.size()]. Then Y[i] is the head of the chain that 12345 lives on.

  2. Y[i] is a pair (val, j). val is some element value. Y[j] is the next item on chain.

  3. Once 12345 is found, deleting it can leave a "hole" in Y.

  4. A new element will be pushed back to Y and X will be adjusted accordingly.

  5. If the number of "holes" in Y exceeds, e.g. 50% of Y.size(), adjust X and Y globally to erase all the "holes", during which Z might be needed as a temporary storage.

The idea applies to trees in std::set and std::map. Of course many other details need to be carefully taken care of.

Has anybody tried something like this? The motivation is to keep the data structure as compact as possible, and to avoid memory allocations as much as possible. I imagine this will yield some good speedup for small and medium size applications -- or maybe I am wrong?

Thanks!


Solution

  • Yes, there are. Google dense_hash_map is one of such example.

    There is an immense variety of hash maps and tables built with purpose-specific requirements like cache locality, size, read speed, write speed. As speed is highly dependent on cache locality, it is very common for these implementations to use vectors as backend storage.

    Have a look at this shootout between hashmaps and browse through every one of them.