Search code examples
c++c++11hashconstantsunordered-set

Modify internal structure when implementing std::hash<T>


I'm writing a custom OrderedTree class I want to use as a key to an unordered_set.

I want to do a couple things when hashing the Tree:

  1. calculate the hash lazily and cache it as needed (since this may be an expensive operation),
  2. maybe balance the tree.

Neither of these operations change the semantic equality or hash value of the object, but they do modify some private fields.

Unfortunately, trying to modify any members in OrderedTree while inside std::hash<Tree>::operator() seems to violate const correctness that unordered_set expects.

Can I use my OrderedTree with unordered_set? If so, how?

EDIT:

As per request in the comments, minimal proof of concept:

#include <unordered_set>

std::size_t hash_combine(std::size_t a, std::size_t b) {
  // TODO: Copy from boost source or something
  return 0;
}

struct Node {
  int value;
  Node *left, *right, *parent;

  std::size_t hash(std::size_t seed) const {
    if (left != nullptr)
      seed = left->hash(seed);
    std::hash<int> hasher;
    seed = hash_combine(seed, hasher(value));
    if (right != nullptr)
      seed = right->hash(seed);
    return seed;
  }
};

struct Tree {

  Tree(): hash_(0), root(nullptr) {}

  Node *root;

  std::size_t hash() const {
    if (hash_ == 0 && root != nullptr) {
      hash_ = root->hash(7);
    }
    return hash_;
  }

private:
  std::size_t hash_;
};

namespace std {
  template<>
  struct hash<Tree> {
    std::size_t operator()(const Tree& t) const {
      return t.hash();
    }
  };
}

int main() {
  std::unordered_set<Tree> set;
}

When I try to compile I get:

Sample.cc:31:13: error: cannot assign to non-static data member within const member function 'hash'
      hash_ = root->hash(7);
      ~~~~~ ^
Sample.cc:29:15: note: member function 'Tree::hash' is declared const here
  std::size_t hash() const {
  ~~~~~~~~~~~~^~~~~~~~~~~~

Solution

  • There is a guarantee that std containers will only call const members when doing const or logically const operations. If those const operations are multiple-reader safe, then so is the container; contrawise, if they are not, neither is the container.

    The immutability of the hash value and equality (or < on ordered containers) are the only things you need guarantee in a key type in an associative container. Actual const gives the above multiple-reader guarantee, which can be quite useful. What more, violating it costs you using this in the future, and/or subtle buts when someone does presume const means immutable.

    You could carefully synchonize the write operation internally to keep the multiple-reader guarantee, or you can give it up.

    To violate const, typically you use mutable. A const method that uses casting to bypass const risks Undefined Behaviour if the object was actually const, and not just a const view of a non-const object.

    In general, be careful before using this kind of optimizaton; it can easily increase code complexity (hance bugs, maintenance, etc) more than it adds speed. And speeding up code is fungible: make sure you identify this as slow code and this part as a bottlenecm prior to investing in it. And if you are going to balance in hash, why wait for hash? Balance before insert!