Search code examples
c++hashunordered-set

Possible to have two keys in an unordered_set, that are considered equal?


From here:

Pred: The unordered_set object uses this expression to determine whether two element keys are equivalent. No two elements in an unordered_set container can have keys that yield true using this predicate.

And also from 23.5.6.1/1:

An unordered_set is an unordered associative container that supports unique keys (an unordered_set contains at most one of each key value) and in which the elements’ keys are the elements themselves. The unordered_set class supports forward iterators.

But I can bypass this if I define my own hasher and equivalent operation that use different features of a class type.

#include <iostream>
#include <unordered_set> //for unordered_set and unordered_multiset

using namespace std;

struct A{
    A() = default;
    A(int a, int b): x(a), y(b) {}
    int x = 0, y = 0;

};

size_t hasher(const A &sd){
    return hash<int>()(sd.x);
}

bool eqOp(const A &lhs, const A &rhs){
    return lhs.y == rhs.y;
}


int main(){

    unordered_set<A, decltype(hasher)*, decltype(eqOp)*> unorderedA(10, hasher, eqOp);

    unorderedA.emplace(2,3);
    unorderedA.emplace(3,3);

    for(const auto& c : unorderedA)
        cout << c.x << " " << c.y << endl;
}

Outputs;

3 3
2 3

I understand what's happening: the hasher places each key in different buckets because of their x value. However those two keys should've been considered equal by my eqOp function due to their y value, but they never get checked for equivalence since they get placed in different buckets. So is it normal functionality that an unordered container can contain two equivalent keys as long as they enter different buckets. Or is it the responsibility of the coder to make sure the hasher and predicate are written to place equivalent keys in the same bucket? Or maybe does my eqOp function violate a rule that the compiler does not detect?

In a set and a map it's clear: the predicate provided to those defines a strict weak ordering and so two keys considered equivalent have only one element associated with them. In an unordered map it's not so clear to me: two equivalent keys can have different elements associated with them so long as those keys are in different buckets. But those sources above feel like they tell me otherwise.


Solution

  • No, it is not possible. Your example is invalid, as standard requires you to return the same hash value for equivalent keys:

    23.2.5 Unordered associative containers

    1. Two values k1 and k2 of type Key are considered equivalent if the container’s key_equal function object returns true when passed those values. If k1 and k2 are equivalent, the hash function shall return the same value for both. [ Note: Thus, when an unordered associative container is instantiated with a non-default Pred parameter it usually needs a non-default Hash parameter as well. — end note ]

    Library user could not ensure values fall into different buckets anyway, since it is not defined how whole all possible hash values are mapped onto bucket indexes or how operations on the map change number of buckets.