python, python-typing, mypy

What is the correct type annotation for a function that returns a Sequence of the same type as one of its inputs?


For the Python code

from typing import Sequence, TypeVar

S = TypeVar("S", bound=Sequence)

def chunk(data: S) -> S:
    return data[:]

mypy 0.971 reports the error

simple_8.py:6:12: error: Incompatible return value type (got "Sequence[Any]", expected "S") [return-value]

Where is the type annotation error in this example and what is the correct, precise way to annotate this function?


Solution

  • If you look at the Typeshed definition of Sequence:

    class Sequence(Collection[_T_co], Reversible[_T_co], Generic[_T_co]):
        @overload
        @abstractmethod
        def __getitem__(self, index: int) -> _T_co: ...
        @overload
        @abstractmethod
        def __getitem__(self, index: slice) -> Sequence[_T_co]: ...
    

    you can see that seq[:] is guaranteed to return some Sequence with the same element type _T_co as the source, but not necessarily the same type of sequence. Although the built-in sequence types generally do preserve their own type, e.g. for list:

        @overload
        def __getitem__(self, __s: slice) -> list[_T]: ...
                                           # ^ concrete list not abstract Sequence
    

    it's not a requirement of the interface.
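To make this concrete, here is a minimal sketch of a user-defined sequence whose slices are not instances of its own class. The Countdown class is hypothetical and exists only for illustration; it satisfies the Sequence interface while returning a plain tuple from seq[:].

```python
from typing import Sequence, Union, overload


class Countdown(Sequence[int]):
    """Hypothetical read-only sequence holding n-1, n-2, ..., 0."""

    def __init__(self, n: int) -> None:
        self._n = n

    def __len__(self) -> int:
        return self._n

    @overload
    def __getitem__(self, index: int) -> int: ...
    @overload
    def __getitem__(self, index: slice) -> Sequence[int]: ...
    def __getitem__(self, index: Union[int, slice]) -> Union[int, Sequence[int]]:
        if isinstance(index, slice):
            # A slice yields a tuple, not a Countdown. The Sequence
            # interface only promises "some Sequence[int]" here.
            return tuple(self._n - 1 - i for i in range(*index.indices(self._n)))
        if index < 0:
            index += self._n
        if not 0 <= index < self._n:
            raise IndexError(index)
        return self._n - 1 - index


sliced = Countdown(3)[:]
print(type(sliced).__name__, sliced)
```

With a class like this, a signature of the form (data: S) -> S would promise a Countdown back while data[:] actually produces a tuple, which is the situation mypy is guarding against.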

    Because the bound in your TypeVar is a bare Sequence with no type argument, its element type defaults to Any, hence the error:

    error: Incompatible return value type (got "Sequence[Any]", expected "S") [return-value]
    

    The slice data[:] gives a Sequence[Any], which might be the same as S but is not required to be. So if you want to support any Sequence, the most precise you can be is:

    from typing import Sequence, TypeVar
    
    T = TypeVar("T")
    
    def chunk(data: Sequence[T]) -> Sequence[T]:
        return data[:]
    


    Alternatively, if you define a Protocol with the stricter requirement that slicing must return the same type as the object being sliced:

    from typing import Protocol, TypeVar
    
    T = TypeVar("T")
    
    class StrictSlice(Protocol):
        def __getitem__(self: T, index: slice) -> T: ...
    

    you can then use this as the bound type in S and get the type-preserving behaviour:

    S = TypeVar("S", bound=StrictSlice)
    
    def chunk(data: S) -> S:
        return data[:]
    
    l: list[int] = chunk([1, 2, 3])
    s: str = chunk("Hello, world!")
    

    If you tried these assignments with the Sequence[T] version instead, mypy would report e.g. expression has type "Sequence[int]", variable has type "List[int]".
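The same protocol should also cover user-defined types, as long as their slice overload returns their own class. The sketch below repeats the definitions above so that it runs on its own; the Route class is a hypothetical example, not part of the original question.

```python
from typing import List, Protocol, TypeVar

T = TypeVar("T")


class StrictSlice(Protocol):
    def __getitem__(self: T, index: slice) -> T: ...


S = TypeVar("S", bound=StrictSlice)


def chunk(data: S) -> S:
    return data[:]


class Route:
    """Hypothetical container whose slices are Route objects again."""

    def __init__(self, stops: List[str]) -> None:
        self.stops = stops

    def __getitem__(self, index: slice) -> "Route":
        # Slicing wraps the sliced list in a new Route, so the
        # "same type in, same type out" contract of StrictSlice holds.
        return Route(self.stops[index])


original = Route(["A", "B", "C"])
copy = chunk(original)
print(type(copy).__name__, copy.stops)
```

Note that Route is not a Sequence at all: it has no __len__ and no integer indexing. The protocol bound only asks for type-preserving slicing, so it is both stricter and narrower than Sequence.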
