Tags: python, numpy, python-typing

Type hint for NumPy ndarray dtype?


I would like to add a type hint to a function parameter that says it is a NumPy ndarray and also specifies its dtype.

With lists, for example, one could do the following...

from typing import List

def foo(bar: List[int]):
    ...

...in order to give a type hint that bar has to be a list of ints.

Unfortunately, the same syntax raises an exception for a NumPy ndarray:

def foo(bar: np.ndarray[np.bool]):
   ...

> np.ndarray[np.bool]) (...) TypeError: 'type' object is not subscriptable

Is it possible to give dtype-specific type hints for np.ndarray?
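
For context, a sketch of why this fails: on Python versions that evaluate annotations eagerly, the expression np.ndarray[np.bool] is executed when the function is defined, and a class can only be subscripted if it defines __class_getitem__, which the ndarray in the traceback evidently did not. The example below reproduces the behaviour with a hypothetical stand-in class, Plain (not part of NumPy), and shows that quoting the annotation avoids the runtime error.

```python
class Plain:
    """Hypothetical stand-in for a class that lacks __class_getitem__."""


# Subscripting such a class fails, just like np.ndarray[np.bool] in the question.
try:
    Plain[bool]
    error_raised = False
except TypeError:
    error_raised = True


# A quoted annotation is stored as a string and is not evaluated at definition time.
def foo(bar: "Plain[bool]"):
    ...


print(error_raised)                # True
print(foo.__annotations__["bar"])  # Plain[bool]
```

Quoting only silences the runtime error. For a type checker to understand the dtype, the class still needs stubs or library support for subscripting.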


Solution

  • Check out the data-science-types package:

    pip install data-science-types
    

    Once it is installed, mypy has access to stubs for NumPy, pandas, and Matplotlib. This allows scenarios like:

    # program.py
    
    import numpy as np
    import pandas as pd
    
    arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
    arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error
    
    df: pd.DataFrame = pd.DataFrame({'col1': [1,2,3], 'col2': [4,5,6]}) # OK
    df1: pd.DataFrame = pd.Series([1,2,3]) # error: Incompatible types in assignment (expression has type "Series[int]", variable has type "DataFrame")
    

    Run mypy as usual:

    $ mypy program.py
    

    Usage with function parameters:

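    import pandas as pd

    def f(df: pd.DataFrame) -> pd.DataFrame:
        # Return the first rows of the frame; mypy checks the argument type.
        return df.head()
    
    if __name__ == "__main__":
        x = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6]})
        print(f(x))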
    
    $ mypy program.py
    > Success: no issues found in 1 source file
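
As a complement that goes beyond the answer above: NumPy 1.21 and later ship their own annotations in the numpy.typing module, including a generic NDArray alias that takes a dtype. The following is a minimal sketch assuming such a NumPy version is installed; the function names count_true and as_int64 are made up for illustration.

```python
import numpy as np
import numpy.typing as npt


def count_true(mask: npt.NDArray[np.bool_]) -> int:
    """Return the number of True entries in a boolean array."""
    return int(np.count_nonzero(mask))


def as_int64(values: npt.ArrayLike) -> npt.NDArray[np.int64]:
    """Convert array-like input to an int64 array."""
    return np.asarray(values, dtype=np.int64)


mask = as_int64([3, 7, 39, -3]) > 0
print(count_true(mask))  # 3
```

These hints are for static checkers such as mypy; NumPy does not validate the dtype at runtime, so passing an array with a different dtype still executes.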