Tags: python, generics, python-typing

Typehint the type of a collection and the collection itself


Say I want to collect an iterable into another type in the collect method:

from typing import Generic, TypeVar, Collection
from dataclasses import dataclass
T = TypeVar("T")

@dataclass
class Example(Generic[T]):
    data: list[T]
    
    def collect(self, collector: type[Collection]) -> Collection[T]:
        return collector(self.data)

Implemented like this, the type information about which collector was used is lost:

# result should type-hint 'set[int]'
# but instead shows 'Collection[int]'
# The collector type is lost..
result = Example([1, 2]).collect(set)

How can I keep both the type of collection and the type of what is held by the collection, while keeping it all generic?


Solution

  • Instead of type[Collection], annotate collector as a Callable whose return type is a type variable C, so that type checkers infer the return type of collect() from whatever callable is passed in:

    from typing import Callable, TypeVar

    C = TypeVar("C")  # whatever the collector returns

    @dataclass
    class Example(Generic[T]):
        data: list[T]
        
        def collect(self, collector: Callable[[list[T]], C]) -> C:
            return collector(self.data)
    
    example = Example([1, 2])
    
    reveal_type(example.collect(set))       # mypy & pyright => set[int]
    reveal_type(example.collect(tuple))     # mypy & pyright => tuple[int, ...]
    reveal_type(example.collect(list))      # mypy & pyright => list[int]
    reveal_type(example.collect(Example))   # mypy & pyright => Example[int]
    

    This also allows plain functions to be passed as collector. That is harmless, since functions and classes are both callables:

    reveal_type(example.collect(lambda elements: list(elements)))
    # mypy & pyright => list[int]
    

    This works because set, tuple, list and Example are all assignable to Callable[[list[int]], C]. From the typeshed library:

    class set(MutableSet[_T]):
        @overload
        def __init__(self) -> None: ...
        @overload
        def __init__(self, __iterable: Iterable[_T]) -> None: ...
    

    The second overload says that a set[_T] may be constructed by passing an Iterable[_T] to the class. list[int] is assignable to Iterable[_T], so type checkers infer _T as int and C as set[int]. The same applies to the other callables mentioned above.

    Another solution is to define __iter__() and let set() et al. be called directly, if collect() doesn't do anything further than passing .data:

    from typing import Iterator

    @dataclass
    class Example(Generic[T]):
        data: list[T]
        
        def __iter__(self) -> Iterator[T]:  # Or Generator[T, None, None]
            yield from self.data
    
    example = Example([1, 2])
    
    reveal_type(set(example))    # mypy & pyright => set[int]
    reveal_type(tuple(example))  # mypy & pyright => tuple[int, ...]
    reveal_type(list(example))   # mypy & pyright => list[int]
    

    Example(example) won't type-check, since Example's __init__() expects a list, not an arbitrary iterable; you would need Example(list(example)).
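
To see both approaches side by side at runtime, here is a minimal self-contained sketch that combines collect() and __iter__() on one class. Bag is a hypothetical collector introduced only for illustration (it is not part of the answer above); because its __init__() accepts an Iterable[T], the same inference reasoning as for set should apply to it. The comments describe what a type checker is expected to infer and are not verified output.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")
C = TypeVar("C")


class Bag(Generic[T]):
    """Hypothetical user-defined collector, for illustration only."""

    def __init__(self, items: Iterable[T]) -> None:
        # Keep the elements in a list, duplicates included.
        self.items: list[T] = list(items)


@dataclass
class Example(Generic[T]):
    data: list[T]

    def collect(self, collector: Callable[[list[T]], C]) -> C:
        # The return type is whatever the collector returns.
        return collector(self.data)

    def __iter__(self) -> Iterator[T]:
        yield from self.data


example = Example([1, 2, 2])

as_set = example.collect(set)   # checkers should infer set[int]
as_bag = example.collect(Bag)   # expected to be inferred as Bag[int]
as_tuple = tuple(example)       # goes through __iter__
copy = Example(list(example))   # rebuild an Example from its own elements
```

Which approach to prefer is a design choice: collect() keeps the conversion as a method on the class, while __iter__() lets any constructor that accepts an iterable work without a dedicated method.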