typing recursive class and inheritance


I have the following class hierarchy:

#!/usr/bin/env python3

from typing import List, Optional, Tuple, Type

class Attribute:
    def __init__(self, name: bytes) -> None:
        self._name = name

    @property
    def name(self) -> bytes:
        return self._name

class Element:
    def __init__(self, name: bytes, attributes: Tuple[Type['Attribute'], ...], elements: Tuple['Element', ...]) -> None:
        self._name       = name
        self._elements   = elements
        self._attributes = attributes

    @property
    def name(self) -> bytes:
        return self._name

    @property
    def elements(self) -> Tuple['Element', ...]:
        return self._elements

    @property
    def attributes(self) -> Tuple[Type['Attribute'], ...]:
        return self._attributes

class SubAttribute1(Attribute):
    def __init__(self, name: bytes, field1: bytes) -> None:
        super().__init__(name)
        self._afield1 = field1

class SubElement1(Element):
    def __init__(self, name: bytes, attributes: Tuple[Type[Attribute], ...], elements: Tuple['Element', ...], field1: bytes, field2: bytes) -> None:
        super().__init__(name, attributes, elements)
        self._field1 = field1
        self._field2 = field2
        
if __name__ == '__main__':
    subE  = SubElement1(b'name', None, None, b'', b'')
    subA  = SubAttribute1(b'name', b'field1')
    subE2 = SubElement1(b'name', (subA,), (subE,), b'', b'')
    print(subE2.elements[0]._field1)
    print(subE2.attributes[0]._afield1)
    print(type(subE2.elements[0]))

I subclass the base classes Element and Attribute to add additional fields. The fields 'elements' and 'attributes' should store objects of the respective derived classes. For example, SubElement1().elements stores a tuple of SubElement1 objects. Everything works at runtime, but I get the following mypy errors:

question.py:45: error: Argument 2 to "SubElement1" has incompatible type "Tuple[SubAttribute1]"; expected "Tuple[Type[Attribute], ...]"
question.py:46: error: "Element" has no attribute "_field1"
question.py:47: error: "Type[Attribute]" has no attribute "_afield1"

How can I change the code to eliminate the mypy errors?


Solution

  • This question is quite interesting; I expected PEP 646 support to be somewhat better than it turned out to be.

    Unless stated otherwise, I assume Python 3.10 and the most recent released version of each checker at the time of writing: mypy==0.991, pyre-check==0.9.17, pyright==1.1.281.

    Make elements proper

    First of all, here's some (simple enough) code that resolves the "elements" issue but does not help with attributes:

    from typing import Sequence, Tuple, TypeVar
    
    
    _Self = TypeVar('_Self', bound='Element')
    
    
    class Attribute:
        def __init__(self, name: bytes) -> None:
            self._name = name
    
        @property
        def name(self) -> bytes:
            return self._name
    
    
    class Element:
        def __init__(self: _Self, name: bytes, attributes: tuple[Attribute, ...], elements: Sequence[_Self]) -> None:
            self._name = name
            self._elements = tuple(elements)
            self._attributes = attributes
    
        @property
        def name(self) -> bytes:
            return self._name
    
        @property
        def elements(self: _Self) -> Tuple[_Self, ...]:
            return self._elements
    
        @property
        def attributes(self) -> Tuple[Attribute, ...]:
            return self._attributes
    
    
    class SubAttribute1(Attribute):
        def __init__(self, name: bytes, field1: bytes) -> None:
            super().__init__(name)
            self._afield1 = field1
    
    
    class SubElement1(Element):
        def __init__(self: _Self, name: bytes, attributes: tuple[Attribute, ...], elements: Sequence[_Self], field1: bytes, field2: bytes) -> None:
            super().__init__(name, attributes, elements)
            self._field1 = field1
            self._field2 = field2
    
    
    if __name__ == '__main__':
        subE  = SubElement1(b'name', tuple(), tuple(), b'', b'')
        subA  = SubAttribute1(b'name', b'field1')
        subE2 = SubElement1(b'name', (subA,), (subE,), b'', b'')
        print(subE2.elements[0]._field1)
        print(subE2.attributes[0]._afield1)  # E: "Attribute" has no attribute "_afield1"  [attr-defined]
        print(type(subE2.elements[0]))
    

    This gives one error (marked with a comment in the source). Here's a playground.
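
    A side note on why the first error from the question no longer appears: the code above annotates attributes as a tuple of Attribute rather than Type[Attribute]. Type[Attribute] describes the class object itself (Attribute or a subclass), while plain Attribute describes instances. A minimal sketch of the difference, where the helpers instance_names and class_names are hypothetical names used only for illustration:

```python
from typing import Tuple, Type


class Attribute:
    def __init__(self, name: bytes) -> None:
        self.name = name


class SubAttribute1(Attribute):
    def __init__(self, name: bytes, field1: bytes) -> None:
        super().__init__(name)
        self.field1 = field1


def instance_names(attributes: Tuple[Attribute, ...]) -> Tuple[bytes, ...]:
    # Accepts *instances* of Attribute or of any subclass.
    return tuple(a.name for a in attributes)


def class_names(attribute_types: Tuple[Type[Attribute], ...]) -> Tuple[str, ...]:
    # Accepts *class objects*: Attribute itself or any subclass.
    return tuple(t.__name__ for t in attribute_types)


sub_a = SubAttribute1(b'name', b'field1')
print(instance_names((sub_a,)))                 # instances go here
print(class_names((Attribute, SubAttribute1)))  # classes go here

# Passing an instance where Type[Attribute] is expected is the mistake
# in the question's annotation; a type checker should reject this call:
# class_names((sub_a,))
```

    So the question's Tuple[Type['Attribute'], ...] asked callers to pass classes, while the call site passed the instance subA.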

    In the near future (this already works on the mypy master branch, but not on 0.991) you'll be able to replace _Self with Self (from typing_extensions import Self) and skip annotating the self argument, like this:

    # import from typing, if python >= 3.11
    from typing_extensions import Self
    
    class Element:
        def __init__(self, name: bytes, attributes: tuple[Attribute, ...], elements: Sequence[Self]) -> None:
            self._name = name
            self._elements = tuple(elements)
            self._attributes = attributes
    

    You can try it here; it gives the same single error.

    Variadic attributes

    Now you want to preserve the attribute types. They can be heterogeneous, so you need PEP 646 to continue: the class becomes generic in an arbitrary number of type variables. pyre and pyright claim to support this (mypy does not; the work is currently in progress). pyre failed to typecheck the solution below, giving a few spurious errors. pyright succeeded (though I personally dislike it, so I won't recommend switching). The pyright sandbox is unofficial, not up to date, and doesn't work here, so copy the code locally, then install and run pyright to verify.

    from typing import Generic, Sequence, Tuple
    from typing_extensions import Unpack, Self, TypeVarTuple
    
    _Ts = TypeVarTuple('_Ts')
    
    
    class Attribute:
        def __init__(self, name: bytes) -> None:
            self._name = name
    
        @property
        def name(self) -> bytes:
            return self._name
    
    class Element(Generic[Unpack[_Ts]]):
        def __init__(self, name: bytes, attributes: tuple[Unpack[_Ts]], elements: Sequence[Self]) -> None:
            self._name = name
            self._elements = tuple(elements)
            self._attributes = attributes
    
        @property
        def name(self) -> bytes:
            return self._name
    
        @property
        def elements(self) -> Tuple[Self, ...]:
            return self._elements
    
        @property
        def attributes(self) -> Tuple[Unpack[_Ts]]:
            return self._attributes
    
    class SubAttribute1(Attribute):
        def __init__(self, name: bytes, field1: bytes) -> None:
            super().__init__(name)
            self._afield1 = field1
    
    class SubElement1(Element[Unpack[_Ts]]):
        def __init__(self, name: bytes, attributes: tuple[Unpack[_Ts]], elements: Sequence[Self], field1: bytes, field2: bytes) -> None:
            super().__init__(name, attributes, elements)
            self._field1 = field1
            self._field2 = field2
            
    if __name__ == '__main__':
        subE  = SubElement1(b'name', tuple(), tuple(), b'', b'')
        subA  = SubAttribute1(b'name', b'field1')
        subE2 = SubElement1(b'name', (subA,), (subE,), b'', b'')
        print(subE2.elements[0]._field1)
        print(subE2.attributes[0]._afield1)
        print(type(subE2.elements[0]))
    

    Pyright reports "0 errors, 0 warnings, 0 informations", while pyre fails with:

    ƛ Found 2 type errors!
    t/t.py:15:14 Undefined or invalid type [11]: Annotation `Unpack` is not defined as a type.
    t/t.py:15:14 Undefined or invalid type [11]: Annotation `_Ts` is not defined as a type.
    

    mypy goes completely crazy even with experimental flags; paste the code into the mypy playground if you want to see the output.

    Homogeneous attributes

    However, if your attributes can be represented by a homogeneous sequence (so that, say, SubElement1 instances can contain only SubAttribute1), things are much simpler, and a generic class with a regular TypeVar is sufficient:

    from typing import Generic, Sequence, Tuple, TypeVar
    
    
    _Self = TypeVar('_Self', bound='Element')
    _A = TypeVar('_A', bound='Attribute')
    
    
    class Attribute:
        def __init__(self, name: bytes) -> None:
            self._name = name
    
        @property
        def name(self) -> bytes:
            return self._name
    
    
    class Element(Generic[_A]):
        def __init__(self: _Self, name: bytes, attributes: Sequence[_A], elements: Sequence[_Self]) -> None:
            self._name = name
            self._elements = tuple(elements)
            self._attributes = tuple(attributes)
    
        @property
        def name(self) -> bytes:
            return self._name
    
        @property
        def elements(self: _Self) -> Tuple[_Self, ...]:
            return self._elements
    
        @property
        def attributes(self) -> Tuple[_A, ...]:
            return self._attributes
    
    
    class SubAttribute1(Attribute):
        def __init__(self, name: bytes, field1: bytes) -> None:
            super().__init__(name)
            self._afield1 = field1
    
    
    class SubElement1(Element[SubAttribute1]):
        def __init__(self: _Self, name: bytes, attributes: Sequence[SubAttribute1], elements: Sequence[_Self], field1: bytes, field2: bytes) -> None:
            super().__init__(name, attributes, elements)
            self._field1 = field1
            self._field2 = field2
    
    
    if __name__ == '__main__':
        subE  = SubElement1(b'name', tuple(), tuple(), b'', b'')
        subA  = SubAttribute1(b'name', b'field1')
        subE2 = SubElement1(b'name', (subA,), (subE,), b'', b'')
        print(subE2.elements[0]._field1)
        print(subE2.attributes[0]._afield1)
        print(type(subE2.elements[0]))
    
    

    And this works.
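
    To make the effect of the type parameter more visible, here is a reduced sketch with two element subclasses, each pinned to its own attribute type. The names ColorAttribute, SizeAttribute, ColorElement and SizeElement are hypothetical and not part of the question; the elements field is left out to keep it short:

```python
from typing import Generic, Sequence, Tuple, TypeVar

_A = TypeVar('_A', bound='Attribute')


class Attribute:
    def __init__(self, name: bytes) -> None:
        self.name = name


class ColorAttribute(Attribute):  # hypothetical subclass
    def __init__(self, name: bytes, color: bytes) -> None:
        super().__init__(name)
        self.color = color


class SizeAttribute(Attribute):  # hypothetical subclass
    def __init__(self, name: bytes, size: int) -> None:
        super().__init__(name)
        self.size = size


class Element(Generic[_A]):
    def __init__(self, name: bytes, attributes: Sequence[_A]) -> None:
        self.name = name
        # Stored as a tuple whose item type follows the type parameter.
        self.attributes: Tuple[_A, ...] = tuple(attributes)


class ColorElement(Element[ColorAttribute]):
    pass


class SizeElement(Element[SizeAttribute]):
    pass


colored = ColorElement(b'c', [ColorAttribute(b'fill', b'red')])
sized = SizeElement(b's', [SizeAttribute(b'width', 3)])

print(colored.attributes[0].color)  # item type is ColorAttribute for the checker
print(sized.attributes[0].size)     # item type is SizeAttribute for the checker

# This mix-up runs without complaint, but a type checker is expected to flag it:
# ColorElement(b'bad', [SizeAttribute(b'width', 3)])
```

    The cost of this approach is that one element cannot mix attribute types; if you need that, you are back to the variadic version.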

    Bonus

    The style of the code you present is known as "writing Java in Python" (Citation). You definitely do not need getters for simple attribute access, because you can always add them later. You also shouldn't write dataclasses by hand: the standard dataclasses module will do it better. So your example really reduces to much more concise and maintainable Python:

    from typing import Generic, Sequence, TypeVar
    from typing_extensions import Self
    from dataclasses import dataclass
    
    
    _A = TypeVar('_A', bound='Attribute')
    
    
    @dataclass
    class Attribute:
        name: bytes
    
    
    @dataclass
    class Element(Generic[_A]):
        name: bytes
        attributes: Sequence[_A]
        elements: Sequence[Self]
    
    
    # OK, if you need different names in constructor signature and class dict
    class SubAttribute1(Attribute):
        def __init__(self, name: bytes, field1: bytes) -> None:
            super().__init__(name)
            self._afield1 = field1
            
            
    # But I'd really prefer
    # @dataclass
    # class SubAttribute1(Attribute):
    #     field1: bytes
    # And adjust calls below to use `field1` instead of `_afield1` - you try to expose it anyway
    
    @dataclass
    class SubElement1(Element[SubAttribute1]):
        field1: bytes
        field2: bytes
    
    
    if __name__ == '__main__':
        subE  = SubElement1(b'name', tuple(), tuple(), b'', b'')
        subA  = SubAttribute1(b'name', b'field1')
        subE2 = SubElement1(b'name', (subA,), (subE,), b'', b'')
        print(subE2.elements[0].field1)
        print(subE2.attributes[0]._afield1)
        print(type(subE2.elements[0]))
    

    ... and it works. Well, it will work soon: currently Self is not fully supported in mypy, and checking this results in an internal error (crash), which I reported here. Pyright reports no errors. UPDATE: the error is fixed on mypy master, and the example above typechecks.
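
    As for the claim that getters can always be added later: a plain attribute can be turned into a property without changing any calling code. A minimal sketch, assuming you later want to validate name; ValidatedAttribute and shout are hypothetical names used only for illustration:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Attribute:
    name: bytes  # plain field, no getter


class ValidatedAttribute:
    """Same public interface as Attribute, but `name` is now a property."""

    def __init__(self, name: bytes) -> None:
        self.name = name  # assignment goes through the setter below

    @property
    def name(self) -> bytes:
        return self._name

    @name.setter
    def name(self, value: bytes) -> None:
        if not value:
            raise ValueError('name must not be empty')
        self._name = value


def shout(attr: Union[Attribute, ValidatedAttribute]) -> bytes:
    # Calling code reads `.name` the same way in both cases.
    return attr.name.upper()


print(shout(Attribute(b'id')))
print(shout(ValidatedAttribute(b'id')))
```

    Both calls go through the same attr.name access, so switching from the field to the property does not require touching callers.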