I am trying to write a base class for Python dataclasses
with a custom hash function, as follows. However, calling hash() on an instance of the child class
does not use the custom hash function of the parent class.
import dataclasses
import joblib

@dataclasses.dataclass(frozen=True)
class HashableDataclass:
    def __hash__(self):
        print("Base class hash was called!")
        fields = dataclasses.fields(self)
        values = tuple(getattr(self, field.name) for field in fields)
        return int(joblib.hash(values), 16)

@dataclasses.dataclass(frozen=True)
class MyDataClass1(HashableDataclass):
    field1: int
    field2: str

obj1 = MyDataClass1(1, "Hello")
print(hash(obj1))
Is there a way to override the hash function of dataclasses?
You should check the documentation:
If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).
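The three cases in that passage can be checked directly. Below is a minimal sketch using only the standard library; the class names (`Base`, `FrozenEq`, `MutableEq`, `NoEq`) are hypothetical, and the base-class hash is a fixed constant chosen only to make inheritance easy to spot.

```python
import dataclasses


class Base:
    # Hypothetical base class with a recognisable custom hash.
    def __hash__(self):
        return 42


@dataclasses.dataclass(frozen=True)  # eq=True by default
class FrozenEq(Base):
    x: int


@dataclasses.dataclass  # eq=True, frozen=False
class MutableEq(Base):
    x: int


@dataclasses.dataclass(frozen=True, eq=False)
class NoEq(Base):
    x: int


# eq and frozen both true: dataclass() generates its own __hash__ on the subclass.
generated_hash = "__hash__" in FrozenEq.__dict__ and FrozenEq.__hash__ is not Base.__hash__

# eq true, frozen false: __hash__ is set to None, so instances are unhashable.
mutable_unhashable = MutableEq.__hash__ is None

# eq false: __hash__ is left untouched, so the inherited method is used.
inherited_hash = hash(NoEq(1))

print(generated_hash, mutable_unhashable, inherited_hash)  # True True 42
```

The first case is the one in the question: the subclass silently receives a generated `__hash__` that shadows the one defined on the parent.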
@dataclasses.dataclass(frozen=True, eq=False)  # <- HERE
class MyDataClass1(HashableDataclass):
    field1: int
    field2: str
Output:
>>> obj1 = MyDataClass1(1, "Hello")
>>> hash(obj1)
Base class hash was called!
1356025966893372872
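According to the comment by @user2357112, you can/should instead use the following (see the reasons in the comments). With eq=False, no field-based __eq__() is generated for the subclass, whereas an explicitly assigned __hash__ is left alone by dataclass() while the generated __eq__() is kept: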
@dataclasses.dataclass(frozen=True)
class MyDataClass1(HashableDataclass):
    __hash__ = HashableDataclass.__hash__  # <- HERE
    field1: int
    field2: str
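To compare the two fixes side by side, here is a small sketch. As an assumption made to keep it self-contained, it swaps `joblib.hash` for the built-in `hash` of the field tuple; the names `WithEqFalse` and `WithExplicitHash` are hypothetical.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class HashableDataclass:
    def __hash__(self):
        # Stand-in for the joblib-based hash in the question (assumption).
        fields = dataclasses.fields(self)
        return hash(("custom",) + tuple(getattr(self, f.name) for f in fields))


@dataclasses.dataclass(frozen=True, eq=False)
class WithEqFalse(HashableDataclass):
    field1: int
    field2: str


@dataclasses.dataclass(frozen=True)
class WithExplicitHash(HashableDataclass):
    __hash__ = HashableDataclass.__hash__  # explicit, so dataclass() keeps it
    field1: int
    field2: str


# Both variants end up using the base-class hash function.
uses_base_hash_eq_false = WithEqFalse.__hash__ is HashableDataclass.__hash__
uses_base_hash_explicit = WithExplicitHash.__hash__ is HashableDataclass.__hash__

# With eq=False the subclass gets no __eq__ of its own. It inherits the one
# generated for the field-less base class, which compares no fields at all.
has_own_eq = "__eq__" in WithEqFalse.__dict__
different_fields_equal = WithEqFalse(1, "Hello") == WithEqFalse(2, "World")

# With the explicit __hash__, the generated field-by-field __eq__ is kept.
same_fields_equal = WithExplicitHash(1, "Hello") == WithExplicitHash(1, "Hello")
other_fields_equal = WithExplicitHash(1, "Hello") == WithExplicitHash(2, "World")

print(uses_base_hash_eq_false, uses_base_hash_explicit)
print(has_own_eq, different_fields_equal)
print(same_fields_equal, other_fields_equal)
```

In this sketch the `eq=False` variant treats instances with different field values as equal, which is usually not what you want for dictionary keys or set members; the explicit `__hash__` assignment avoids that while still using the custom hash.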