Search code examples
pythonlisthashsethashable

Why is my hashable object not found in a set of that hashable object, the set being an attribute of another object?


I have a recursive relationship between objects of two classes: Foo has a set of Bar objects (in its bars attribute) and every Bar has a list of Foo objects (in its foos attribute). I've implemented is as shown in the MWE below, but the test fails. Why?

According to the Python Glossary, a set can only contain hashable objects and hashable objects:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method).

I don't really know if my MWE below satisfies that the objects have a hash that never changes, because the hash depends on the the other object(s) in the list and set attributes. Is there a way to solve this problem?

Minimal working example:

import unittest


class Test(unittest.TestCase):
    def test_foo_bar(self):
        foo = Foo()
        bar = Bar()
        bar.add_foo(foo)

        print(bar)
        print(foo.bars)
        print(hash(bar))
        for bar in foo.bars:
            print(hash(bar))
        # The previous print lines print the following:
        # <mwe2.Bar object at 0x105ba8080>
        # {<mwe2.Bar object at 0x105ba8080>}
        # -9223096319794529578
        # -9223096319794529578

        # The following assertion is OK
        self.assertTrue(bar in {bar})

        # The following assertion fails with AssertionError: False is not true
        self.assertTrue(bar in foo.bars)



class Foo:
    def __init__(self):
        self.bars = set()

    def __hash__(self) -> int:
        return hash(self.__dict__.values())


class Bar:
    def __init__(self):
        self.foos = list()

    def __hash__(self) -> int:
        return hash(tuple(self.foos))

    def add_foo(self, foo: "Foo"):
        foo.bars.add(self)
        self.foos.append(foo)


if __name__ == '__main__':
    unittest.main()

I use CPython, Python 3.6.x.


Solution

  • The problem is that your Bar's hash changes immediately after you add it to Foo.bars. You can see this if you add some print statements in the add_foo method:

    def add_foo(self, foo: "Foo"):
        foo.bars.add(self)
        print(hash(self))
        self.foos.append(foo)
        print(hash(self))
    

    Output:

    3527539
    957074234
    

    This is because the hash is calculated based on self.foos, so any modification of self.foos also changes the object's hash.

    (Side note: Contrary to what's stated in the question, bar in {bar} evaluates to True as you'd expect. There is no reason why it wouldn't. I suspect that there was some kind of mistake while debugging.)


    A simple way to make the unittest work is to swap the two lines of code in add_foo:

    def add_foo(self, foo: "Foo"):
        self.foos.append(foo)  # append first
        foo.bars.add(self)  # add to the set with the new hash
    

    Output:

    Ran 1 test in 0.000s
    
    OK
    

    However, this isn't a real fix: It won't help if you call add_foo more than once. If you ever call add_foo after adding a Bar object to a set or a dict, you'll get the same problem again.

    I think it's clear that having interdependent classes with inconsistent hashes is a bad design choice. Possible solutions include

    1. Removing the Foo -> Bar -> Foo dependency cycle
    2. Finding a consistent hash method