I was trying to improve my understanding of the implementation of unordered_map
and was surprised by this behavior. Consider this minimal example below.
#include <iostream>
#include <unordered_map>
using namespace std;
template<>
struct std::hash<int*>
{
size_t operator()(int* arr) const
{
cout << "custom hash called" << endl;
return arr[0];
}
};
template <>
struct std::equal_to<int*>
{
bool operator()(const int* lhs, const int* rhs) const
{
std::cout << "call to compare" << std::endl;
return lhs == rhs;
}
};
int main(int argc, char *argv[])
{
int arr1[8] {11,12,13,14,15,16,17,18};
int arr2[8] {1,2,3,4,5,6,7,8};
unordered_map<int*, string> myMap;
myMap.insert(make_pair(arr1, "one"));
myMap.insert({arr2, "two"});
}
I would have expected this output:
custom hash called
custom hash called
The hash for both inserts is unique and therefore no comparison of multiple keys should be required as I understand it (since the bucket should only contain exactly one key). And indeed this is the result when I try it with Clang, GCC and MSVC on godbolt.org. However, when I compile and run this example on a local Mac an additional call to the equal_to call operator happens for the second insert:
custom hash called
custom hash called
call to compare
Tested with
Apple clang version 13.1.6 (clang-1316.0.21.2)
Target: arm64-apple-darwin21.4.0
Thread model: posix
and
Apple clang version 13.1.6 (clang-1316.0.21.2.3)
Target: x86_64-apple-darwin21.4.0
Thread model: posix
In all cases only the C++20 flag was used.
There are basically two cases where the comparator does not need to be applied:
struct Hash {
size_t operator()(int a) const { return a; }
};
struct Equal { ... /* log operator call */ };
std::unordered_map<int, int, Hash, Equal> m;
m.reserve(2);
std::cout << m.bucket(0) << std::endl; // 0
std::cout << m.bucket(1) << std::endl; // 1
m.insert({0, 0});
m.insert({1, 0})
Here, both keys 0 and 1 target different buckets, so there is no comparison with both implementations.
Live demo: https://godbolt.org/z/5jfYv6sba
std::unordered_map<int, int, Hash, Equal> m;
m.reserve(2);
std::cout << m.bucket(0) << std::endl; // 0
std::cout << m.bucket(2) << std::endl; // 0
m.insert({0, 0});
m.insert({2, 0})
Here, both keys target the same bucket (with index 0). With libstdc++, since the hashes are cached and are different, they are compared and there is no reason to additionally compare the entire keys. However, with libc++, hashes are not cached and the keys need to be compared.
Live demo: https://godbolt.org/z/vWK4Ko7Yj