Search code examples
c++hashc++17hashtable

Why C++ STL hash table (unordered_map) doesn't accept vectors as keys


I assume that there is a reason behind this design choice. Boost seems to have an implementation for it, hence it should be possible to use vectors as hash table keys. Are there any theoretical properties for hash functions applied to arrays that make them more prone to collisions or other undesirable behavior?


Solution

  • You'll notice Boost doesn't actually have an extension to accept a vector<T> as a key specifically - instead it has an extension that lets you use any Iterable - and it generates the hash only as a function of the Iterable's contents...

    This may or may not be desirable, depending on:

    • If you want to use object-identity rather than object-value as the basis for hashing... or not.
    • If you're comfortable with hashing being a non-constant-time operation... or not.
      • Just because boost::hash_range appears to be O(n) doesn't mean the underlying iterable won't take 5 minutes to return all hashable values for each call...
    • If the order of elements does - or doesn't matter.
      • (I believe) using boost::hash_range or boost::hash_combine with one of two distinct but equivalent unordered_set objects will result in different hash-codes despite their value-equivalence.
    • If two conceptually different objects that can iterate over the same values (e.g. a vector<uint8_t> representing a data buffer, or queue<SomeEnum> where SomeEnum : uint8_t representing a queue of values) should have the same hahs-code... or not.

    I suspect the team behind the STL doesn't like the fact that there's so many contextual "if"s described above which would mean it wouldn't be sensible to provide default behaviour and so they require you to always be more explicit with your hash-generation for arbitrary objects (besides, if you want Boost's behaviour, then just use Boost in the first place - it's not like Boost is competing with the STL).

    Also see this QA: C++ unordered_map using a custom class type as the key