
Using boost::hash with boost::property_tree?


I am attempting to generate a hash value for an object that contains a boost::property_tree (boost::property_tree::basic_ptree<std::basic_string<char>, std::basic_string<char> >). Searching through the boost header files for property tree I cannot find any defined hash_value function for it. Basic example of what I am trying to achieve:

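#include <boost/functional/hash.hpp>
#include <boost/property_tree/ptree.hpp>

class MyClass{
public:
    friend std::size_t hash_value(const MyClass & obj);
private:
    boost::property_tree::ptree m_data;  // property_tree is a namespace; ptree is the type
};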

inline std::size_t hash_value(const MyClass & obj){
    std::size_t seed = 0;
    boost::hash_combine(seed,obj.m_data);
    return seed;
}

This code will fail to compile with: "no matching function for call to 'hash_value(const boost::property_tree &)'"

My question: Is hash_value defined for boost::property_tree in some header file I haven't found? If not, what is the idiomatic Boost way to hash a property_tree by traversing it?

Should I use ptree serialization to convert the tree to a std::string and hash that, or manually traverse the tree and build the hash recursively?


Solution

  • Just specialize boost::hash<> for basic_ptree:

    #include <boost/property_tree/ptree.hpp>
    #include <boost/functional/hash.hpp>
    
    namespace boost {
        template<typename Key, typename Data, typename KeyCompare>
        struct hash<boost::property_tree::basic_ptree<Key, Data, KeyCompare> > {
            size_t operator()(boost::property_tree::basic_ptree<Key, Data, KeyCompare> const& pt) const {
                std::size_t seed = 0;
                boost::hash_combine(seed, pt.template get_value<std::string>());
                boost::hash_range(seed, pt.begin(), pt.end());
                return seed;
            }
        };
    }
    

    That's enough! Here's a small MyClass that reads its tree from a JSON string:

    #include <boost/property_tree/json_parser.hpp>
    #include <sstream>
    #include <string>
    
    class MyClass{
      public:
          MyClass(std::string const& json) {
              std::istringstream iss(json);
              read_json(iss, m_data);
          }
      private:
        boost::property_tree::ptree m_data;
    
        friend inline std::size_t hash_value(const MyClass& obj){
            std::size_t seed = 0;
            boost::hash_combine(seed, obj.m_data);
            return seed;
        }
    };
    

    Now you can test it:

    #include <iostream>
    
    int main() {
        for (std::string const data : {
                R"({"a":[1,2,3],"b":{"nest":"hello","more":"world"}})",
                R"({"b":{"nest":"hello","more":"world"},"a":[1,2,3]})",
                R"({ })",
                R"({})",
            })
        {
            MyClass o(data);
            std::cout << "object hash: " << hash_value(o) << " " << data << "\n";
        }
    }
    

    Prints:

    object hash: 3573231694259656572 {"a":[1,2,3],"b":{"nest":"hello","more":"world"}}
    object hash: 11176663460548092204 {"b":{"nest":"hello","more":"world"},"a":[1,2,3]}
    object hash: 3864292196 { }
    object hash: 3864292196 {}
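
    The first two hashes differ because the specialization walks the children in insertion order, while { } and {} both parse to an empty tree and so hash alike. Below is a minimal, standard-library-only sketch of that traversal. Node, combine and hash_node are hypothetical stand-ins for ptree, boost::hash_combine and the specialization above; they are not Boost APIs, and the mixing formula is only assumed to resemble the classic hash_combine.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a property tree node (not a Boost type).
struct Node {
    std::string key;             // name under which the parent stores this node
    std::string value;           // the node's own data
    std::vector<Node> children;  // kept in insertion order, like ptree
};

// Hypothetical mixer in the style of the classic boost::hash_combine.
inline void combine(std::size_t& seed, std::size_t h) {
    seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hash the node's own value, then every (key, subtree) pair in insertion order.
inline std::size_t hash_node(Node const& n) {
    std::size_t seed = 0;
    combine(seed, std::hash<std::string>{}(n.value));
    for (Node const& child : n.children) {
        combine(seed, std::hash<std::string>{}(child.key));
        combine(seed, hash_node(child));  // recurse into the subtree
    }
    return seed;
}
```

    Each step mixes the running seed, so two trees with the same members in a different order will in general produce different values, while two empty nodes always produce the same one.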
    

    CAUTION: Hash and Equality

    Hash-based containers such as std::unordered_map use hash<> together with an equality comparator, and they require that keys which compare equal also hash equal. If the two disagree, you get Undefined Behaviour.

    You might be tempted to define the hash_range in terms of the ordered (associative) interface of ptree:

    boost::hash_range(seed, pt.ordered_begin(), pt.not_found()); // CAUTION
    

    This has the advantage that {"a":1,"b":2} would hash to the same value as {"b":2,"a":1}.

    Don't do this unless you know what you're doing. Specifically, you need to pass a compatible equality comparator to every container/algorithm that uses this hash.

    If you write it like this, and test using a driver like the following (which assumes MyClass also has an operator== that compares m_data, not shown above):

    int main() {
        MyClass a{ R"({"a":[1,2,3],"b":{"nest":"hello","more":"world"}})" },
                b{R"({"b":{"nest":"hello","more":"world"},"a":[1,2,3]})" },
                c{R"({ })" },
                d{R"({})" };
    
        for (auto& lhs : {a,b,c,d})
        for (auto& rhs : {a,b,c,d})
        {
            std::cout << "hash: " << hash_value(lhs) << " " << hash_value(rhs) << " - equality: " << std::boolalpha << (lhs==rhs) << "\n";
            if ((hash_value(lhs) == hash_value(rhs)) != (lhs==rhs))
                std::cout << " -- MISMATCH\n";
        }
    }
    

    It will print:

    hash: 10737438301360613971 10737438301360613971 - equality: true
    hash: 10737438301360613971 10737438301360613971 - equality: false
     -- MISMATCH
    hash: 10737438301360613971 3864292196 - equality: false
    hash: 10737438301360613971 3864292196 - equality: false
    hash: 10737438301360613971 10737438301360613971 - equality: false
     -- MISMATCH
    hash: 10737438301360613971 10737438301360613971 - equality: true
    hash: 10737438301360613971 3864292196 - equality: false
    hash: 10737438301360613971 3864292196 - equality: false
    hash: 3864292196 10737438301360613971 - equality: false
    hash: 3864292196 10737438301360613971 - equality: false
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 10737438301360613971 - equality: false
    hash: 3864292196 10737438301360613971 - equality: false
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 3864292196 - equality: true
    

    The MISMATCH warnings indicate that equality and hash do not agree.

    If you run the test driver with the original, insertion-order hash from the top of this answer, hash and equality agree and no MISMATCH is printed:

    hash: 3573231694259656572 3573231694259656572 - equality: true
    hash: 3573231694259656572 11176663460548092204 - equality: false
    hash: 3573231694259656572 3864292196 - equality: false
    hash: 3573231694259656572 3864292196 - equality: false
    hash: 11176663460548092204 3573231694259656572 - equality: false
    hash: 11176663460548092204 11176663460548092204 - equality: true
    hash: 11176663460548092204 3864292196 - equality: false
    hash: 11176663460548092204 3864292196 - equality: false
    hash: 3864292196 3573231694259656572 - equality: false
    hash: 3864292196 11176663460548092204 - equality: false
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 3573231694259656572 - equality: false
    hash: 3864292196 11176663460548092204 - equality: false
    hash: 3864292196 3864292196 - equality: true
    hash: 3864292196 3864292196 - equality: true
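
    If key order really should not matter, the hash and the equality have to ignore it together. Below is a minimal, standard-library-only sketch, assuming a flat list of (key, value) strings as a stand-in for one level of a ptree. Members, sorted_copy, mix, OrderInsensitiveHash, OrderInsensitiveEqual and MemberSet are hypothetical names, not part of Boost.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for one level of a tree: (key, value) pairs in insertion order.
using Members = std::vector<std::pair<std::string, std::string>>;

// Canonical form: the same members, sorted by key and then by value.
inline Members sorted_copy(Members m) {
    std::sort(m.begin(), m.end());
    return m;
}

// Hypothetical mixer in the style of the classic boost::hash_combine.
inline void mix(std::size_t& seed, std::string const& s) {
    seed ^= std::hash<std::string>{}(s) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hashes the canonical form, so insertion order does not affect the result.
struct OrderInsensitiveHash {
    std::size_t operator()(Members const& m) const {
        std::size_t seed = 0;
        for (auto const& kv : sorted_copy(m)) {
            mix(seed, kv.first);
            mix(seed, kv.second);
        }
        return seed;
    }
};

// Compares the canonical forms, so it agrees with the hash above.
struct OrderInsensitiveEqual {
    bool operator()(Members const& a, Members const& b) const {
        return sorted_copy(a) == sorted_copy(b);
    }
};

// Both functors are passed to the container, keeping hash and equality in agreement.
using MemberSet = std::unordered_set<Members, OrderInsensitiveHash, OrderInsensitiveEqual>;
```

    With this pairing, inserting the members of {"a":1,"b":2} and then those of {"b":2,"a":1} into a MemberSet leaves a single element, because both the hash and the comparator treat them as the same key. Sorting on every call is wasteful; it is only meant to keep the sketch short.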