Tags: python, numpy, python-typing, mypy

Numpy's `NDArray[np.int_]` not compatible with Python's `Sequence[Integral]`?


Code to reproduce:

from numbers import Integral
from collections.abc import Sequence

import numpy as np
from numpy.typing import NDArray


def f(s: Sequence[Integral]):
    print(s)


def g() -> NDArray[np.int_]:
    return np.asarray([1, 2, 3])


def _main() -> None:
    a = g()
    f(a)


if __name__ == "__main__":
    _main()

On line 18 (the f(a) call inside _main) I get the following Mypy error:

Argument 1 to "f" has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Sequence[Integral]"

What's also weird is that this error doesn't occur if I remove the -> None annotation from def _main() -> None.

EDIT

It's even worse. Forget about Integral and np.int_. NDArray itself (or indeed the np.ndarray class) seems to be incompatible with Sequence, which doesn't make any sense to me. Run the following code:

from collections.abc import Sequence

import numpy as np
from numpy.typing import NDArray


def f(s: Sequence):
    print(s)


def g() -> np.ndarray:
    return np.asarray([1, 2, 3])


def _main() -> None:
    a = g()
    f(a)


if __name__ == "__main__":
    _main()

I get the Mypy error: Argument 1 to "f" has incompatible type "ndarray[Any, Any]"; expected "Sequence[Any]" [arg-type].

EDIT AGAIN

My versions are:

  • Python 3.10.1
  • Mypy 1.0.1
  • Numpy 1.24.2

Solution

  • The problem is that NDArray[T] is considered neither a nominal nor a structural subtype of Sequence[T].

    The first is easily explained: np.ndarray does not inherit from collections.abc.Sequence. Therefore it is not a nominal subtype of it.

    The second actually took me a bit of digging, but the underlying issue is that the stubs for the Sequence type do not define it as a protocol. While it is common to talk about the "sequence protocol", there is technically no built-in protocol class for it.

    This is in contrast to something like Iterable for example. That is indeed defined as a protocol, which means that any class (no matter its inheritance) that implements the corresponding __iter__ method will be treated as a structural subtype of Iterable by type checkers.

    See the typeshed source for Iterable vs. that for Sequence. A protocol must have typing.Protocol as one of its immediate base classes. As to why something like Sequence is not made an actual protocol, I found no clear explanation. I found a brief mention in response to this comment, but no real reasons were given.
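The nominal vs. structural distinction also shows up at runtime, which makes it easy to try out without mypy. The following is only a sketch: ArrayLike is a hypothetical stand-in for np.ndarray (it is not part of numpy), chosen so the snippet needs nothing beyond the standard library. It has the usual sequence-style methods but does not inherit from Sequence.

```python
from collections.abc import Iterable, Iterator, Sequence


class ArrayLike:
    """Hypothetical stand-in for np.ndarray: sequence-style methods, no Sequence base."""

    def __init__(self, data: list[int]) -> None:
        self._data = data

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, index: int) -> int:
        return self._data[index]

    def __iter__(self) -> Iterator[int]:
        return iter(self._data)


arr = ArrayLike([1, 2, 3])

# Iterable is matched structurally: having __iter__ is enough.
is_iterable = isinstance(arr, Iterable)

# Sequence is matched nominally: it requires inheritance (or explicit registration).
is_sequence = isinstance(arr, Sequence)

print(is_iterable, is_sequence)
```

As far as I know, the same two isinstance checks give the same answers for a real np.ndarray, since it is not registered as a Sequence either.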

    If we change f in your example to accept an Iterable, mypy will recognize the parameter type as a supertype of NDArray:

    from collections.abc import Iterable
    from numbers import Integral
    
    import numpy as np
    from numpy.typing import NDArray
    
    
    def f(s: Iterable[Integral]) -> None:
        print(s)
    
    
    def g() -> NDArray[np.int_]:
        return np.asarray([1, 2, 3])
    
    
    if __name__ == "__main__":
        f(g())
    

    However, this brings us to a tangentially related problem. In terms of actual usefulness, you might as well ditch numbers.Integral here, because even Iterable[str] would not cause an error if we passed an NDArray[np.int_] to it.

    This is because ndarray.__iter__ is typed as returning Any, which means all bets are off and any ndarray will match literally any iterable. Why that is the case is again something better discussed with the numpy developers.

    You should just be aware that no check of the item type will ever fail if you rely on the Iterable protocol for numpy arrays.
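To illustrate the effect of an Any-typed __iter__ without depending on numpy's stubs, here is a small sketch. AnyIterArray, total and join_words are hypothetical names made up for this example; the class only mimics the one relevant detail, namely an __iter__ annotated as returning Any. I would expect mypy to accept both calls below, even though the second one is clearly wrong.

```python
from collections.abc import Iterable
from typing import Any


class AnyIterArray:
    """Hypothetical stand-in for np.ndarray whose __iter__ is annotated as returning Any."""

    def __init__(self, data: list[int]) -> None:
        self._data = data

    def __iter__(self) -> Any:  # Any is compatible with Iterator[T] for every T
        return iter(self._data)


def total(values: Iterable[int]) -> int:
    return sum(values)


def join_words(words: Iterable[str]) -> str:
    return " ".join(words)


arr = AnyIterArray([1, 2, 3])
int_total = total(arr)  # the intended usage

# The items are ints, not strings, so this call is a mistake.
# A type checker is not expected to flag it; it only fails at runtime.
try:
    join_words(arr)
    join_failed = False
except TypeError:
    join_failed = True
```

In other words, the Iterable[Integral] annotation documents intent for the reader, but for such a class it does not buy any static checking of the element type.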

    If you really want to have an "actual" sequence-like protocol, you'll need to write one yourself, but I fear this might become difficult for this use case because the numpy type annotations are not trivial and (as far as I know) not comprehensive at all.
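For reference, a minimal sketch of what such a hand-written protocol could look like. SequenceLike and first_and_last are hypothetical names, not something provided by typing or numpy. The protocol deliberately covers only __len__, __getitem__ with an int index, and __iter__; it leaves out Sequence methods such as index and count, which np.ndarray does not provide. I have not verified how numpy's stubs match against it, and given the Any-typed methods mentioned above, the element type may well end up unchecked for arrays.

```python
from collections.abc import Iterator
from typing import Protocol, TypeVar, runtime_checkable

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class SequenceLike(Protocol[T_co]):
    """Hypothetical protocol: sized, iterable, and indexable by int."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int, /) -> T_co: ...

    def __iter__(self) -> Iterator[T_co]: ...


def first_and_last(seq: SequenceLike[T]) -> tuple[T, T]:
    # Uses only the operations the protocol promises.
    return seq[0], seq[len(seq) - 1]


print(first_and_last([1, 2, 3]))
print(first_and_last("abc"))
```

Note that runtime_checkable only makes isinstance test for the presence of the methods, not their signatures, so for example a dict would also pass the runtime check. The static check by mypy is stricter.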