python, python-typing, typeguards, type-narrowing

Type Narrowing of Class Attributes in Python (TypeGuard) without Subclassing


Suppose I have a Python class with attributes (e.g. a dataclass, pydantic, attrs, Django model, ...) whose types are unions, e.g. None and a state. I also have a complex checking function that checks some of those values.

When I use this checking function, I want to tell the type checker that some of my class attributes are narrowed.

For instance, see this simplified example:

import dataclasses
from typing import TypeGuard


@dataclasses.dataclass
class SomeDataClass:
    state: tuple[int, int] | None
    name: str
    # Assume many more data attributes


class SomeDataClassWithSetState(SomeDataClass):
    state: tuple[int, int]


def complex_check(data: SomeDataClass) -> TypeGuard[SomeDataClassWithSetState]:
    # Assume some complex checks here, for simplicity it is only:
    return data.state is not None and data.name.startswith("SPECIAL")


def get_sum(data: SomeDataClass) -> int:
    if complex_check(data):
        return data.state[0] + data.state[1]
    return 0


As shown, it is possible to do this with subclasses, which for various reasons is not an option for me:

  • it introduces a lot of duplication
  • some libraries used for such data classes do not handle being subclassed well without extra conditions
  • there could be some metaclass or __subclasses__ magic that handles all subclasses specially, e.g. creating a database for the dataclasses

So is there an option to narrow the type of one or more attributes of a class without introducing an entirely new class, as proposed here?


Solution

  • TL;DR: You cannot narrow the type of an attribute. You can only narrow the type of an object.

    As I already mentioned in my comment, typing.TypeGuard is only useful when there are two distinct types T and S. The type guard function takes an object of type T, and if it returns True, the type checker assumes the object is of type S; if it returns False, the checker keeps treating the object as T.

    You say you don't want to have another class/subclass alongside SomeDataClass for various (vaguely valid) reasons. But if you don't have another type, then TypeGuard is useless. So that is not the route to take here.

    I understand that you want to reduce the type-safety checks like if obj.state is None because you may need to access the state attribute in multiple different places in your code. You must have some place in your code, where you create/mutate a SomeDataClass instance in a way that ensures its state attribute is not None. One solution then is to have a getter for that attribute that performs the type-safety check and only ever returns the narrower type or raises an error. I typically do this via @property for improved readability. Example:

    from dataclasses import dataclass
    
    
    @dataclass
    class SomeDataClass:
        name: str
        optional_state: tuple[int, int] | None = None
    
        @property
        def state(self) -> tuple[int, int]:
            if self.optional_state is None:
                raise RuntimeError("or some other appropriate exception")
            return self.optional_state
    
    
    def set_state(obj: SomeDataClass, value: tuple[int, int]) -> None:
        obj.optional_state = value
    
    
    if __name__ == "__main__":
        foo = SomeDataClass(optional_state=(1, 2), name="foo")
        bar = SomeDataClass(name="bar")
        baz = SomeDataClass(name="baz")
        set_state(bar, (2, 3))
        print(foo.state)
        print(bar.state)
        try:
            print(baz.state)
        except RuntimeError:
            print("baz has no state")
    

    I realize you mean there are many more checks happening in complex_check, but either that function doesn't change the type of data or it does. If the type remains the same, you need to introduce type-safety for attributes like state in some other place, which is why I suggest a getter method.

    Another option is obviously to have a separate class, which is what is typically done with FastAPI/Pydantic/SQLModel for example, and to use clever inheritance to reduce code duplication. You mentioned this may cause problems because of subclassing magic. Well, if it does, use the other approach, but I can't think of an example that would cause the problems you mentioned. Maybe you can be more specific and show a case where subclassing would lead to problems.