Assuming we have a set of elements that we want to store in a hash-based container (for example std::unordered_set), and each element has a key of type uint64_t whose value can range from 0 to the maximum possible value, is the trivial hash function, where the hash of a key is the key itself, the best choice? Does it depend on the container in use (e.g. Google's sparse hash vs. std::unordered_map from the STL)? The probability distribution of the key values is unknown.
If all you have to hash is a uint64_t that can take any possible value with unknown probabilities, and your output must be a uint64_t, then you gain no advantage by transforming the value. Just use the key itself.
If you knew something about the distribution of your values, or your values were restricted to a smaller range (which is really the same thing as knowing the distribution), then it could be beneficial to apply a transformation to the key, but this depends on the container's implementation. You only benefit by reducing collisions when the table maps a hash to a bucket index, and that depends both on the table's algorithm and on the current/average state of the table (how heavily each bucket is used).