Suppose I have a Python class (e.g. a dataclass, a Pydantic model, an attrs class, a Django model, ...) with an attribute whose type is a union, e.g. None and a state. I also have a complex checking function that checks some values.
If I use this checking function, I want to tell the type checker that some of my class attributes are narrowed.
For instance, see this simplified example:

    import dataclasses
    from typing import TypeGuard


    @dataclasses.dataclass
    class SomeDataClass:
        state: tuple[int, int] | None
        name: str
        # Assume many more data attributes


    class SomeDataClassWithSetState(SomeDataClass):
        state: tuple[int, int]


    def complex_check(data: SomeDataClass) -> TypeGuard[SomeDataClassWithSetState]:
        # Assume some complex checks here, for simplicity it is only:
        return data.state is not None and data.name.startswith("SPECIAL")


    def get_sum(data: SomeDataClass) -> int:
        if complex_check(data):
            return data.state[0] + data.state[1]
        return 0

As shown, this is possible with subclasses, which for various reasons is not an option for me, for example because of __subclasses__ magic that handles all subclasses specially, e.g. creating a database for the dataclasses.

So is there a way to narrow the type of one or more attributes of a class without introducing an entirely new class, as proposed here?
TL;DR: With a type guard function you cannot narrow the type of an attribute. You can only narrow the type of the object passed to it.
As I already mentioned in my comment, typing.TypeGuard is only useful when there are two distinct types T and S. Depending on the returned bool, the type guard function tells the type checker to assume the object is either T or S.
You say you don't want to have another class/subclass alongside SomeDataClass for various (vaguely valid) reasons. But if you don't have another type, then TypeGuard is useless. So that is not the route to take here.
I understand that you want to reduce type-safety checks like if obj.state is None, because you may need to access the state attribute in many different places in your code. There must be some place in your code where you create or mutate a SomeDataClass instance in a way that ensures its state attribute is not None. One solution, then, is to have a getter for that attribute that performs the type-safety check and either returns the narrower type or raises an error. I typically do this via @property for improved readability. Example:

    from dataclasses import dataclass


    @dataclass
    class SomeDataClass:
        name: str
        optional_state: tuple[int, int] | None = None

        @property
        def state(self) -> tuple[int, int]:
            if self.optional_state is None:
                raise RuntimeError("or some other appropriate exception")
            return self.optional_state


    def set_state(obj: SomeDataClass, value: tuple[int, int]) -> None:
        obj.optional_state = value


    if __name__ == "__main__":
        foo = SomeDataClass(optional_state=(1, 2), name="foo")
        bar = SomeDataClass(name="bar")
        baz = SomeDataClass(name="baz")
        set_state(bar, (2, 3))
        print(foo.state)
        print(bar.state)
        try:
            print(baz.state)
        except RuntimeError:
            print("baz has no state")

I realize there are many more checks happening in complex_check, but either that function changes the type of data or it doesn't. If the type remains the same, you need to introduce type-safety for attributes like state in some other place, which is why I suggest a getter method.
Another option is obviously to have a separate class, which is what is typically done with FastAPI/Pydantic/SQLModel for example, and to use clever inheritance to reduce code duplication. You mentioned this may cause problems because of subclassing magic. If it does, use the other approach, but I can't think of an example that would cause the problems you mentioned. Maybe you can be more specific and show a case where subclassing would lead to problems.