In a std::unordered_map
hash, the key is stored so it can find the value in O(1).
But what if a hash has two different keys, for example, and the key is a string, and they're equal to each other?
Can it cause a problem?
This is known as a collision. It is also the reason why we talk about "buckets" when talking about hash maps, because it is perfectly normal to have multiple keys with the same hash end up in the same "bucket".
This means that lookup is always a two-stage process:
Note also that even keys that have different hashes can still end up in the same bucket, because hashes are typically larger than the size of the table, so the hash table further reduces the hash to a smaller size, e.g. by taking it modulo the size of the table.
There are multiple different strategies to deal with it, the two most popular are:
1 * hash2(key)
th, 2 * hash2(key)
th, 3 * hash2(key)
th, 4 * hash2(key)
th bucket, and so on.In both cases, you will need to periodically re-size the hash table. In the second case, simply because at some point, it will be full. In both the first and the second case, to make the buckets smaller and the search faster.
If you use exponential resizing, you will typically still have an amortized worst-case step complexity of O(1) for insertion, deletion, and lookup.