Search code examples
rusthashmaphashset

How to create a "keyless" HashSet and HashMap in Rust?


The implementations of HashMap and HashSet found in std::collections both store the keys being hashed, and not just the hash. This is useful in most situations, but not all.

I am making sets and maps with quite large data as keys, and I have no need to actually read the keys back, i.e. I'm not using my_map.keys(). In my use case, the keys are just a giant waste of memory.

How do I create a map or a set that does not store the keys, taking a &K instead of an owned K at insertion?


Solution

  • The standard library HashMap needs to store the keys mainly for two reasons: checking for collisions and resizing.

    The only remnant of the hash that is retained is the item's index in the internal array, which may not even be representative of the hash if the hash collided when inserted and the entry ended up in another index. So the key lets you check for equality once the index is computed from the hash.

    For resizing, since the hash is not stored, all hashes need to be recomputed from the keys in order to reinsert all entries into the larger buffer.

    If you can handle collisions yourself or can ignore some amount of collision, you can use an intermediate hash as a key. If the hash is random enough, you could also create the HashMap with a simple Hasher implementation that merely repeats the bytes it is given. How large and random your intermediate hash is will determine how often collisions will happen. If you put an entire sha256 hash as the key, you'll realistically never have collisions.