I have a pretty straightforward inheritance case.
I have read a lot of the mypy documentation but still can't figure out how to properly deal with basic cases like this one.
It looks like very standard OOP inheritance to me, so I can't imagine mypy doesn't have a clean way to deal with it.
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Parent:
    a: int = 0

    def __add__(self, other: Parent) -> Parent:
        a = self.a + other.a
        return self.__class__(a)


@dataclass
class Child(Parent):
    b: int = 0

    def __add__(self, other: Child) -> Child:
        a = self.a + other.a
        b = self.b + other.b
        return self.__class__(a, b)


obj1 = Child(1)
obj2 = Child(1, 42)
print(obj1 + obj2)
mypy error message:
foo.py:18: error: Argument 1 of "__add__" is incompatible with supertype "Parent"; supertype defines the argument type as "Parent"
foo.py:18: note: This violates the Liskov substitution principle
foo.py:18: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides
Versions:
$ python --version
Python 3.10.4
$ mypy --version
mypy 0.971 (compiled: yes)
Judging by your follow-up comment, I assume you were actually interested in a way to have a generalized __add__ method with proper type inference for mypy and other type checkers. Generics, as always, save the day.
Here is a working example:
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import TypeVar

T = TypeVar("T", bound="Parent")


@dataclass
class Parent:
    a: int = 0

    def __add__(self: T, other: T) -> T:
        # Add the two instances field by field and build a new instance of the same class.
        return self.__class__(**{
            field.name: getattr(self, field.name) + getattr(other, field.name)
            for field in fields(self.__class__)
        })


@dataclass  # required so that `b` becomes a dataclass field
class Child(Parent):
    b: int = 0


if __name__ == '__main__':
    c1 = Child(a=1, b=2)
    c2 = Child(a=2, b=3)
    c3 = c1 + c2
    print(c3)
    reveal_type(c3)  # for mypy only; remove before running (not a runtime builtin)
The output is, of course, Child(a=3, b=5), and mypy states:
note: Revealed type is "[...].Child"
Obviously, this will break once any subclass of Parent introduces a field with a non-addable type.
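To make that caveat concrete, here is a minimal sketch. The `Tagged` subclass, its `tags` field, and the `try_add` helper are hypothetical names made up for illustration; `frozenset` is chosen only because it does not support `+`.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import TypeVar

T = TypeVar("T", bound="Parent")


@dataclass
class Parent:
    a: int = 0

    def __add__(self: T, other: T) -> T:
        # Same field-by-field addition as in the example above.
        return self.__class__(**{
            field.name: getattr(self, field.name) + getattr(other, field.name)
            for field in fields(self.__class__)
        })


@dataclass
class Tagged(Parent):  # hypothetical subclass, not part of the original example
    tags: frozenset[str] = frozenset()  # frozenset does not implement "+"


def try_add(x: Parent, y: Parent) -> str:
    """Return the repr of x + y, or a short message if the addition fails."""
    try:
        return repr(x + y)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_add(Parent(1), Parent(2)))  # Parent(a=3)
print(try_add(Tagged(1), Tagged(2)))  # a TypeError message for the `tags` field
```

The type checker has no complaint about `Tagged`, so this failure only shows up at runtime. If such fields are possible, the generic `__add__` would need to skip or special-case them.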
As for the subtyping issue, mypy actually tells you everything: you violated the Liskov substitution principle (LSP). This is not specific to mypy or even to Python. As @Wombatz said, this is just how types (and subtypes) have to work to be considered reasonable.
If you have a subtype of a type T, its interface must be a superset of T's interface, never a strict subset.
Let me try and expand on the LSP issue a little bit with a slightly altered example.
Say I have the following parent and child classes:
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Parent:
    n: float

    def __add__(self, other: Parent) -> Parent:
        return Parent(self.n + other.n)


class Child(Parent):
    def round_n(self) -> float:
        return round(self.n, 0)
Now I write a simple function that takes two instances of Parent, adds them and prints the result:
def foo(obj1: Parent, obj2: Parent) -> None:
    print(obj1 + obj2)
So far so good. No problems here.
The signature of foo requires both arguments to be instances of Parent, which means they can also be instances of any subclass of Parent. That means they can also be instances of Child.
Assuming foo is internally type safe, each of the following calls is perfectly safe:
parent = Parent(1.0)
child = Child(0.2)
foo(parent, parent)
foo(parent, child)
foo(child, parent)
foo(child, child)
The output obviously being Parent(n=2.0), Parent(n=1.2), Parent(n=1.2), and Parent(n=0.4).
But what happens if I now decide to override the __add__ method for the Child class, restricting it to accept only other Child instances? Say I write it like this:
class Child(Parent):
    def __add__(self, other: Child) -> Child:
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)
Ignoring the inheritance for a moment, this seems entirely reasonable and type safe. We annotated the other argument of __add__ with Child, which tells the type checker that only instances of Child can be added to an instance of Child. That means we can safely call the round_n method on other, because all Child instances have that method.
Notably, Parent instances do not have the round_n method. But that is fine, since we annotated other in a way that excludes Parent instances.
But what about our foo function now?
Remember, it allows both arguments to be Parent instances, which implies Child instances as well. The whole idea behind the LSP comes into play here. We do not want to concern ourselves with specifics that some subclass of Parent may have. We assume that whatever a subclass does will not break the Parent interface it inherits.
Specifically, we assume that while a subclass may override the __add__ method, nothing about those changes will restrict the way we can call it. Since Parent.__add__ can be called with other being any instance of Parent, we assume that Child.__add__ can also be called with any instance of Parent.
How could it be otherwise? There can be infinitely many subclasses of Parent, and foo cannot possibly be expected to verify that each of their interfaces is still compatible with that of Parent. That is what being a subtype should guarantee. That is the Liskov Substitution Principle:
If foo is correct and it accepts arguments of type Parent, then substituting an argument of subtype Child must not impact the correctness of foo.
We already established that foo is correct. However, if we try the same calls as before now (with the altered Child class), one of them will fail:
foo(child, parent)
It fails (predictably to us) with the following traceback:
Traceback (most recent call last):
  File "[...].py", line x, in <module>
    foo(child, parent)
  File "[...].py", line y, in foo
    print(obj1 + obj2)
  File "[...].py", line z, in __add__
    return Child(self.round_n() + other.round_n())
AttributeError: 'Parent' object has no attribute 'round_n'
We are passing a Parent as other to Child.__add__, which does not work.
I hope this illustrates the issue a bit better.
What to do now?
Aside from the suggestions in the linked post, the dirtiest solution would be the following:
class Child(Parent):
    def __add__(self, other: Parent) -> Child:
        if not isinstance(other, Child):
            raise RuntimeError
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)
This is technically correct, just not very nice to any user. A user would probably expect to be able to call Child.__add__ with other being a Parent, so in practice you would probably implement some logic to return something reasonable, like so:
class Child(Parent):
    def __add__(self, other: Parent) -> Child:
        if not isinstance(other, Child):
            return Child(self.round_n() + other.n)
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)
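As a variation on the earlier `raise RuntimeError` version, a binary operator method in Python can instead return `NotImplemented` to signal that it cannot handle the other operand. The interpreter then tries the reflected method on the other operand and raises `TypeError` if nothing works. Below is a minimal sketch, assuming the same `Parent` and `Child` as above; `describe_sum` is a hypothetical helper added only for the demonstration. I believe type checkers accept `NotImplemented` as a return value here, but treat that as an assumption and check it with your own mypy version.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Parent:
    n: float

    def __add__(self, other: Parent) -> Parent:
        return Parent(self.n + other.n)


class Child(Parent):
    def __add__(self, other: Parent) -> Child:
        if not isinstance(other, Child):
            # Hand the decision back to Python: it looks for other.__radd__,
            # and since Parent defines none, the "+" ends in a TypeError.
            return NotImplemented
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)


def describe_sum(obj1: Parent, obj2: Parent) -> str:  # hypothetical helper
    try:
        return repr(obj1 + obj2)
    except TypeError:
        return "unsupported"


print(describe_sum(Child(0.2), Child(0.7)))   # Child(n=1.0)
print(describe_sum(Child(0.2), Parent(1.0)))  # unsupported
```

The signature still accepts any `Parent`, so the override stays compatible with the supertype, but callers get the conventional `TypeError` for unsupported operands rather than a `RuntimeError`.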
Notice, by the way, that restricting the return type to a subtype is no problem. Since this post is way too long as it is, I'll leave it as an exercise for the reader to deduce why that is the case.
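For anyone who wants to check their answer to that exercise, here is a minimal sketch. `total` is a hypothetical caller that relies only on the `Parent` interface of the result. Since every `Child` is also a `Parent`, a narrower return type can never surprise it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Parent:
    n: float

    def __add__(self, other: Parent) -> Parent:
        return Parent(self.n + other.n)


class Child(Parent):
    def __add__(self, other: Parent) -> Child:
        # Accepts any Parent (same as the supertype) but promises a Child back.
        if not isinstance(other, Child):
            return Child(self.round_n() + other.n)
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)


def total(obj1: Parent, obj2: Parent) -> float:  # hypothetical caller
    result = obj1 + obj2  # statically a Parent; may be a Child at runtime
    return result.n       # .n exists on every Parent, hence on every Child


parent = Parent(1.0)
child = Child(0.2)
print(total(parent, parent))  # 2.0
print(total(parent, child))   # 1.2
print(total(child, parent))   # 1.0
print(total(child, child))    # 0.0
print(isinstance(child + parent, Parent))  # True
```

The asymmetry is the point: arguments flow into the method, so an override may only accept more; return values flow out, so an override may only promise more.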