Tags: python, python-typing, iterable-unpacking, pyright

Is it possible to maintain type information when unpacking object attributes?


Imagine I have an object which is an instance of a class such as the following:

@dataclass
class Foo:
    bar: int
    baz: str

I'm using dataclasses for convenience, but in the context of this question, there is no requirement that the class be a dataclass.

Normally, if I want to unpack the attributes of such an object, I must implement __iter__, e.g. as follows:

class Foo:
    ...
    def __iter__(self) -> Iterator[Any]:
        return iter(dataclasses.astuple(self))

bar, baz = Foo(1, "qux")

However, from the perspective of a static type checker like pyright, I've now lost all type information for bar and baz, which it can only infer to be of type Any. I could improve on this slightly by building the tuple passed to iter by hand:

    def __iter__(self) -> Iterator[Union[str, int]]:
        return iter((self.bar, self.baz))

But I still don't have specific types for bar and baz. I can annotate bar and baz and then use dataclasses.astuple directly as follows:

bar: int
baz: str
bar, baz = dataclasses.astuple(Foo(1, "qux"))

but that necessitates less readable multi-level list comprehensions such as

bars: list[int] = [
    bar for bar, _ in [dataclasses.astuple(foo) for foo in [(Foo(1, "qux"))]]
]

and also ties me to dataclasses.

Obviously, none of this is insurmountable. If I want to use a type checker, I can just not use the unpack syntax, but I would really like to if there's a clean way to do it.

An answer that is specific to dataclasses, or better yet, attrs, is acceptable if a general method is not currently possible.


Solution

  • As juanpa.arrivillaga has pointed out, the assignment statement docs state that, when the left-hand side of an assignment statement is a comma-separated list of one or more targets,

    The object must be an iterable with the same number of items as there are targets in the target list, and the items are assigned, from left to right, to the corresponding targets.

    Therefore, to unpack a bare object, one must implement __iter__, and when the attributes have different types, its return type can only be Iterator[Union[...]] or Iterator[SufficientlyGenericSubsumingType]. An iterator type carries a single element type for every position, so a static type checker cannot infer a specific type for each unpacked variable.

    Presumably, when a tuple is on the right hand side of an assignment, even though the language specification indicates that it will be treated as an iterable, a static type checker can still reason effectively about the types of its constituents.
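
    The contrast can be sketched in isolation. The function names below are hypothetical and used only for illustration; the comments describe what a checker such as pyright would be expected to infer, not anything the code itself enforces at runtime.

```python
from typing import Iterator, Union


def as_iterator() -> Iterator[Union[int, str]]:
    # An iterator type has one element type shared by every position.
    return iter((1, "qux"))


def as_tuple() -> tuple[int, str]:
    # A fixed-length tuple type records a separate type per position.
    return 1, "qux"


# Both targets are expected to be inferred as int | str.
a, b = as_iterator()

# c is expected to be inferred as int, d as str.
c, d = as_tuple()
```

    At runtime both unpackings behave identically; only the static types differ.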

    As such, as juanpa.arrivillaga has also pointed out, a bespoke astuple method which emits a tuple[...] type is probably the best approach if one must unpack attributes, even though it does not avoid the pitfall of multi-level list comprehensions mentioned in the question. In terms of the question, we could now have:

    @dataclass
    class Foo:
        bar: int
        baz: str
    
        def astuple(self) -> tuple[int, str]:
            return self.bar, self.baz
    
    
    bar, baz = Foo(1, "qux").astuple()
    bars = [bar for bar, _ in [foo.astuple() for foo in [(Foo(1, "qux"))]]]
    

    This needs no explicit target annotations, provided we're willing to write the extra class boilerplate.

    Neither dataclasses' nor attrs' astuple function is typed as returning anything more precise than tuple[Any, ...], so the targets must still be annotated separately if we opt to use those.
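
    A minimal sketch of the two ways to live with that loose return type, assuming Python 3.9+ for the built-in tuple[...] syntax. The cast-based astuple method is an illustrative variation, not something either library provides; typing.cast is not verified by the checker, so the annotation has to be kept in sync with the fields by hand.

```python
import dataclasses
from dataclasses import dataclass
from typing import cast


@dataclass
class Foo:
    bar: int
    baz: str

    def astuple(self) -> tuple[int, str]:
        # dataclasses.astuple is typed as returning tuple[Any, ...];
        # the cast asserts the per-field types without checking them.
        return cast("tuple[int, str]", dataclasses.astuple(self))


# Option 1: declare the targets, then unpack the loosely typed tuple.
bar: int
baz: str
bar, baz = dataclasses.astuple(Foo(1, "qux"))

# Option 2: unpack through the cast-based method; no declarations needed.
bar2, baz2 = Foo(2, "quux").astuple()
```

    Note that dataclasses.astuple also recurses into nested dataclasses and containers, so for classes with such fields a hand-written method returning self.bar, self.baz may be the safer choice.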

    However, for list comprehensions, is any of this really better than

    bars = [foo.bar for foo in [Foo(1, "qux")]]
    

    ? Probably not, in most cases.

    As a final note, the attrs "Why not?" page mentions, in answer to "why not namedtuples?", that

    Since they are a subclass of tuples, namedtuples have a length and are both iterable and indexable. That’s not what you’d expect from a class and is likely to shadow subtle typo bugs.

    Iterability also implies that it’s easy to accidentally unpack a namedtuple which leads to hard-to-find bugs.

    I'm not sure I totally agree with either of those points, but they are something to consider for anyone else wanting to go this route.
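
    For anyone weighing that route, here is a minimal sketch using typing.NamedTuple. FooTuple is a hypothetical name, and the sample values are made up. Because a NamedTuple is a fixed-length tuple with a known type per field, a type checker should infer int and str for the unpacked targets without a custom __iter__ or astuple method, at the cost of the caveats quoted above.

```python
from typing import NamedTuple


class FooTuple(NamedTuple):
    bar: int
    baz: str


# Expected inference: bar is int, baz is str.
bar, baz = FooTuple(1, "qux")

# Unpacking inside a comprehension needs no intermediate conversion.
foos = [FooTuple(1, "qux"), FooTuple(2, "quux")]
bars = [b for b, _ in foos]
```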