
__hash__ implementation not working with set() [Python]


I am implementing __hash__ for a class, using the hash of the username as the object's hash value, i.e.:

class DiscordUser:
    def __init__(self, username):
        self.username = username

    def __hash__(self):
        return hash(self.username)

The problem arises when I add such objects to a set and then check membership with a new object constructed from the exact same username, i.e.:

user = DiscordUser("Username#123")

if user in users_set:
    # user is already in users_set; this branch is NEVER taken and I don't understand why
    pass
else:
    # add user to users_set; this branch is ALWAYS taken
    users_set.add(user)

Why is the hash function not working properly, or what am I doing wrong here?


Solution

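  • The hash function is working properly. Set membership uses __hash__() first, but when two objects have the same hash, set falls back to __eq__() to decide whether they are actually equal. Your class does not define __eq__(), so it inherits the default from object, which compares by identity: two separate DiscordUser instances built from the same username hash alike but are never equal, so the membership test fails. Define __eq__() to compare usernames and the lookup will succeed. Ultimately, set guarantees that no two elements are equal, not that no two elements have equal hashes. The hash value is used as a first pass because it is often less expensive to compute than equality.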
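
A minimal sketch of the fix, assuming the username alone identifies a user. The __eq__ body below is my assumption about what "equal" should mean here; it is not from the question:

```python
class DiscordUser:
    def __init__(self, username):
        self.username = username

    def __eq__(self, other):
        # Assumption: two users are equal when their usernames match.
        if not isinstance(other, DiscordUser):
            return NotImplemented
        return self.username == other.username

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.username)


users_set = {DiscordUser("Username#123")}
duplicate = DiscordUser("Username#123")   # a different object, same username

found = duplicate in users_set   # same hash, and __eq__ now reports equality
users_set.add(duplicate)         # no effect: an equal element is already present
size = len(users_set)
```

With __eq__ defined, the membership test from the question takes the "already in the set" branch, and the set keeps a single element.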

    Why?

    There is no guarantee that any two objects with the same hash are in fact equal. Consider that there are infinitely many possible values for self.username in your DiscordUser. Python uses SipHash for hashing str values. SipHash has a finite output range, so collisions must be possible.
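
To see that equal hashes do not imply equal elements, here is a small sketch with a hypothetical Key class whose hash is deliberately constant, so every instance collides. The class and its names are made up for illustration:

```python
class Key:
    """Hypothetical class: every instance has the same hash."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value

    def __hash__(self):
        return 42  # forced collision for every instance


a, b = Key("a"), Key("b")
keys = {a, b, Key("a")}          # the second Key("a") equals a, so it is dropped

same_hash = hash(a) == hash(b)   # the hashes collide
count = len(keys)                # a and b are both kept because they are not equal
```

The set ends up with two elements: the collision only means the set has to call __eq__ to tell them apart, which makes lookups slower but not wrong.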

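    Be careful about using a mutable value as input to hash(). The hash value of an object is expected to stay the same for its lifetime; if username is changed after the object has been added to a set, lookups for that object can fail.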
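
A rough illustration of that pitfall, using a hypothetical MutableUser class (same shape as DiscordUser with __eq__ added; the names are mine):

```python
class MutableUser:
    """Hypothetical class whose hash depends on a mutable attribute."""

    def __init__(self, username):
        self.username = username

    def __eq__(self, other):
        return isinstance(other, MutableUser) and self.username == other.username

    def __hash__(self):
        return hash(self.username)


user = MutableUser("Username#123")
users_set = {user}

user.username = "Renamed#456"    # the hash changes while the object sits in the set

# The set stored the object under its old hash, so a lookup with the new
# hash is expected to miss even though the object is still in there.
still_found = user in users_set
still_stored = any(u is user for u in users_set)
```

Iterating over the set still reaches the object, but the membership test uses the new hash and does not find it. Treating username as read-only after construction avoids this.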

    Take a look at this answer for some nice info about sets, hashing, and equality testing in Python.


    Edit: Python has used SipHash for str values since version 3.4.