Tags: python, inheritance, type-hinting, mypy

mypy and basic inheritance


I have this pretty straightforward inheritance case.

I read a bunch of mypy documentation but still can't figure out how to properly deal with those basic cases.

It's very standard OOP inheritance to me so I can't imagine mypy doesn't have a clean way to deal with those cases.

from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Parent:
    a: int = 0

    def __add__(self, other: Parent) -> Parent:
        a = self.a + other.a
        return self.__class__(a)


@dataclass
class Child(Parent):
    b: int = 0

    def __add__(self, other: Child) -> Child:
        a = self.a + other.a
        b = self.b + other.b
        return self.__class__(a, b)


obj1 = Child(1)
obj2 = Child(1, 42)
print(obj1 + obj2)

mypy error message:

foo.py:18: error: Argument 1 of "__add__" is incompatible with supertype "Parent"; supertype defines the argument type as "Parent"
foo.py:18: note: This violates the Liskov substitution principle
foo.py:18: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides

Versions:

$ python --version
Python 3.10.4
$ mypy --version
mypy 0.971 (compiled: yes)

Solution

  • Judging by your follow-up comment, I assume you were actually interested in a way to have a generalized __add__ method with proper type inference for mypy and other type checkers. Generics, as always, save the day.

    Here is a working example:

    from __future__ import annotations
    from dataclasses import dataclass, fields
    from typing import TypeVar
    
    T = TypeVar("T", bound="Parent")
    
    @dataclass
    class Parent:
        a: int = 0
    
        def __add__(self: T, other: T) -> T:
            return self.__class__(**{
                field.name: getattr(self, field.name) + getattr(other, field.name)
                for field in fields(self.__class__)
            })
    
    @dataclass  # required, otherwise `b` is not a dataclass field of Child
    class Child(Parent):
        b: int = 0
    
    if __name__ == '__main__':
        c1 = Child(a=1, b=2)
        c2 = Child(a=2, b=3)
        c3 = c1 + c2
        print(c3)
        reveal_type(c3)  # for mypy only; remove before running on Python < 3.11, where this name does not exist at runtime
    

    Output is of course Child(a=3, b=5) and mypy states:

    note: Revealed type is "[...].Child"
    

    Obviously, this will break once any subclass of Parent introduces a field with a non-addable type.
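
To make that caveat concrete, here is a minimal sketch. The `Tagged` subclass, its `labels` field, and the `try_add` helper are hypothetical names added purely for illustration. Sets do not support `+`, so the generic `__add__` fails at runtime, and nothing in its signature gives a type checker a chance to notice.

```python
from __future__ import annotations
from dataclasses import dataclass, field, fields
from typing import TypeVar

T = TypeVar("T", bound="Parent")


@dataclass
class Parent:
    a: int = 0

    def __add__(self: T, other: T) -> T:
        # Same generic implementation as above: add field by field.
        return self.__class__(**{
            f.name: getattr(self, f.name) + getattr(other, f.name)
            for f in fields(self.__class__)
        })


@dataclass
class Tagged(Parent):  # hypothetical subclass, not from the question
    labels: set[str] = field(default_factory=set)  # sets do not support "+"


def try_add() -> str:
    """Return the name of the exception raised by the addition, or "ok"."""
    try:
        Tagged(1, {"x"}) + Tagged(2, {"y"})
    except TypeError as exc:
        return type(exc).__name__
    return "ok"


print(try_add())  # prints "TypeError"
```

If such fields are expected, the comprehension would need some per-field rule for combining values instead of a blanket `+`.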


    As for the subtyping issue, mypy actually tells you everything. You violated the LSP. This is not specific to mypy or even to Python. As @Wombatz said, this is just how types (and subtypes) have to work to be considered reasonable.

    If you have a subtype of a type T, its interface must be a superset of T's interface, never a strict subset.

    PS

    Let me try and expand on the LSP issue a little bit with a slightly altered example.

    Say I have the following parent and child classes:

    from __future__ import annotations
    from dataclasses import dataclass
    
    @dataclass
    class Parent:
        n: float
    
        def __add__(self, other: Parent) -> Parent:
            return Parent(self.n + other.n)
    
    class Child(Parent):
        def round_n(self) -> float:
            return round(self.n, 0)
    

    Now I write a simple function that takes two instances of Parent, adds them and prints the result:

    def foo(obj1: Parent, obj2: Parent) -> None:
        print(obj1 + obj2)
    

    So far so good. No problems here.

    The signature of foo requires both arguments to be instances of Parent, which means they can also be instances of any subclass of Parent. That means they can also be instances of Child.

    Assuming foo is internally type safe, each of the following calls is perfectly safe:

    parent = Parent(1.0)
    child = Child(0.2)
    
    foo(parent, parent)
    foo(parent, child)
    foo(child, parent)
    foo(child, child)
    

    The output obviously being Parent(n=2.0), Parent(n=1.2), Parent(n=1.2), and Parent(n=0.4).

    But what happens if I now decide to override the __add__ method in the Child class, restricting it to accept only other Child instances? Say I write it like this:

    class Child(Parent):
        def __add__(self, other: Child) -> Child:
            return Child(self.round_n() + other.round_n())
    
        def round_n(self) -> float:
            return round(self.n, 0)
    

    Ignoring the inheritance for a moment, this seems entirely reasonable and type safe. We annotated the other argument of __add__ with Child, which tells the type checker that only instances of Child can be added to an instance of Child. That means we can safely call the round_n method on other, because all Child instances have that method.

    Notably, Parent instances do not have the round_n method. But that is fine, since we annotated other in a way that excludes Parent instances.

    But what about our foo function now?

    Remember, it allows both arguments to be Parent instances, which implies Child instances as well. The whole idea behind the LSP comes into play here. We do not want to concern ourselves with specifics that some subclass of Parent may have. We assume that whatever a subclass does will not break the Parent interface it inherits.

    Specifically we assume that while a subclass may override the __add__ method, nothing about those changes will restrict the way we can call it. Since Parent.__add__ can be called with other being any instance of Parent, we assume that Child.__add__ can also be called with any instance of Parent.

    How could it be otherwise? There can be infinitely many subclasses of Parent and foo can not possibly be expected to verify that each of their interfaces is still compatible with that of Parent. That is what being a subtype should guarantee. That is the Liskov Substitution Principle:

    If foo is correct and it accepts arguments of type Parent, then substituting an argument of subtype Child must not impact the correctness of foo.

    We already established that foo is correct. However, if we try the same calls as before (now with the altered Child class), one of them will fail:

    foo(child, parent)
    

    It fails (predictably to us) with the following traceback:

    Traceback (most recent call last):
      File "[...].py", line x, in <module>
        foo(child, parent)
      File "[...].py", line y, in foo
        print(obj1 + obj2)
      File "[...].py", line z, in __add__
        return Child(self.round_n() + other.round_n())
    AttributeError: 'Parent' object has no attribute 'round_n'
    

    We are passing a Parent as other to Child.__add__, which does not work.
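
For anyone who wants to reproduce this, the pieces above can be put together into one file roughly as follows. This is only a sketch: `foo` returns its result instead of printing it, and the `failing_call` helper is a hypothetical addition so the failure can be observed without a crash. mypy is expected to flag `Child.__add__` here, which is exactly the error from the question.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Parent:
    n: float

    def __add__(self, other: Parent) -> Parent:
        return Parent(self.n + other.n)


class Child(Parent):
    # Narrowing "other" to Child is the incompatible override mypy complains about.
    def __add__(self, other: Child) -> Child:
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)


def foo(obj1: Parent, obj2: Parent) -> Parent:
    # Correct with respect to Parent's interface; returns instead of printing.
    return obj1 + obj2


def failing_call() -> str:
    """Run foo(child, parent) and report which exception, if any, it raises."""
    try:
        foo(Child(0.2), Parent(1.0))
    except AttributeError as exc:
        return type(exc).__name__
    return "ok"


print(failing_call())  # prints "AttributeError"
```

The other three combinations still run, because in each of them `other` either has `round_n` or ends up in `Parent.__add__`, which never calls it.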

    I hope this illustrates the issue a bit better.


    What to do now?

    Aside from the suggestions in the linked post, the dirtiest solution would be the following:

    class Child(Parent):
        def __add__(self, other: Parent) -> Child:
            if not isinstance(other, Child):
                raise RuntimeError
            return Child(self.round_n() + other.round_n())
    
        def round_n(self) -> float:
            return round(self.n, 0)
    

    This is technically correct, just not very nice to users. A user would probably expect to be able to call Child.__add__ with other being a Parent, so in practice you would probably implement some logic to return something reasonable, like so:

    class Child(Parent):
        def __add__(self, other: Parent) -> Child:
            if not isinstance(other, Child):
                return Child(self.round_n() + other.n)
            return Child(self.round_n() + other.round_n())
    
        def round_n(self) -> float:
            return round(self.n, 0)
    

    Notice, by the way, that restricting the return type to a subtype is no problem. Since this post is way too long as it is, I'll leave it as an exercise for the reader to deduce why that is the case.
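
As a starting point for that exercise (a sketch, not a full answer): a caller like foo only relies on getting back something it can treat as a Parent, and every Child qualifies. The snippet below reuses the last version of Child from above. Having foo return its result instead of printing it is an assumption made here so the results can be inspected.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Parent:
    n: float

    def __add__(self, other: Parent) -> Parent:
        return Parent(self.n + other.n)


class Child(Parent):
    # Parameter type unchanged (Parent), return type narrowed to Child.
    def __add__(self, other: Parent) -> Child:
        if not isinstance(other, Child):
            return Child(self.round_n() + other.n)
        return Child(self.round_n() + other.round_n())

    def round_n(self) -> float:
        return round(self.n, 0)


def foo(obj1: Parent, obj2: Parent) -> Parent:
    # foo only promises a Parent; a Child result still satisfies that.
    return obj1 + obj2


parent = Parent(1.0)
child = Child(0.2)

for left, right in [(parent, parent), (parent, child), (child, parent), (child, child)]:
    result = foo(left, right)
    # Every result can be used wherever a Parent is expected.
    print(type(result).__name__, result.n)
```

All four calls succeed, and whenever the left operand is a Child the result is a Child too, which foo can handle without knowing anything about Child.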