Search code examples
pythonpython-typingmypypyright

How to infer the type of the first element in an iterable?


How to infer the type of the first element in an iterable in MyPy/Pyright?

Is there any way to annotate the code below to a bit more narrow scoped? Meaning that I want the type checker to assume that the function below returns an object of the type that is contained in the iterable or list.

from typing import Iterable, TypeVar

T = TypeVar('T')

def get_first(iterable: Iterable[T]) -> T | None:
    for item in iterable:
        return item
    return None

For example:

class Bob:
    pass

ll : List[Bob] = [Bob(),  Bob()]  # Create a list that contains objects of type Bob
first_bob = get_first(ll)   # <---  Type checker should infer that the type of first_bob is Bob

I am looking for a solution that works specifically in MyPy/Pyright.


Solution

  • The code in question already satisfies part of your requirements: "...returns an object of the type that is contained in the iterable...". It does, but optional (that or None), because there are no entries in an empty iterable.

    If you want to express "has at least one item", you need a typing construct that supports some sort of a length constraint. There is such a construct in python: tuple. It's the only Sequence that can be non-homogeneous, and the only one that can have fixed length (e.g. you can express "x should be a 2-tuple" statically). Other constructs (Iterable, Sequence, List, any user-defined type not subclassing tuple) do not talk about length, for them length is just __len__ method meaning "there is a way to get some value from len(x) call".

    So you can only guarantee "T" exactly for tuple types by specifying that they have at least one item. Any other iterable will need a fallback to T | None since it can be empty.

    from collections.abc import Iterable
    from typing import TypeVar, overload
    
    T = TypeVar('T')
    
    @overload
    def get_first(iterable: tuple[T, *tuple[object, ...]]) -> T: ...
    @overload
    def get_first(iterable: Iterable[T]) -> T | None: ...
    def get_first(iterable: Iterable[T]) -> T | None:
        for item in iterable:
            return item
        return None
    

    And usage:

    reveal_type(get_first((1,)))  # int
    reveal_type(get_first((1, 'a', 'b')))  # int
    reveal_type(get_first([]))  # Error (unknown type variable)
    t: list[int] = []
    reveal_type(get_first(t))  # int | None
    reveal_type(get_first([1]))  # int | None
    

    mypy playground (from comment, seems offline now - hope it'll recover soon), pyright playground. Note that users will need to pass a real tuple of known length (and not e.g. tuple[int, ...]) - it can be either created as literal or returned from some other function annotated to return an N-tuple.

    Note that this is almost a "fundamental" restriction. Haskell has a very strong type system, supporting a wide variety of type operations, but its head is partial: it has a signature [a] -> a, and will terminate with exception when input is an empty list. Exception means "did not return", so that's fine for the type system - you can do the same in Python (s/return None/raise ValueError), and you won't have a None trouble anymore. If you want to have a "safe" head, you need to define it somehow, probably using Maybe monad (same as Option in Rust, similar to | None in Python).

    Attempts to spell "non-empty collection" exist, but mostly are implemented via a separate type that validates a container at runtime (e.g. in Swift) and wraps it: you can declare your own "NonEmptyList" and accept it, and consumers will have to create one manually somewhere.

    There's a solution for that in TypeScript, but that's likely just because arrays and tuples are so similar there. It still needs explicit checks, but would be very helpful in your specific case by providing an additional overload for such types without undefined.