Search code examples
c++hashstlunordered-set

Why is std::unordered_set rehashed even if the load factor limit is not broken?


According to cppreference,

Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count().

In addition, [unord.req]/15 has similar rules:

The insert and emplace members shall not affect the validity of iterators if (N+n) <= z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container's bucket count, and z is the container's maximum load factor.

However, consider the following example:

#include <unordered_set>
#include <iostream>

int main()
{
    std::unordered_set<int> s;
    s.emplace(1);
    s.emplace(42);
    std::cout << s.bucket_count() << ' ';
    std::cout << (3 > s.max_load_factor() * s.bucket_count()) << ' ';
    s.emplace(2);
    std::cout << s.bucket_count() << ' ';
}

With GCC 8.0.1, it outputs

3 0 7

This means after emplacing 2, a rehashing occurs though the new number of elements (3) is not greater than max_load_factor()*bucket_count() (note the second output is 0). Why does this happen?


Solution

  • The rehash condition is changed since Issue 2156. Before the change, a rehash occurs when the new number of elements is no less than max_load_factor()*bucket_count(), and it becomes "greater than" after the change.

    GCC 8.0.1 does not implement this change. There is already a bug report, and it has been fixed in GCC 9.