python pandas mypy

Type annotation/hint for index in pandas.DataFrame.iterrows()


I am trying to add type annotations/hints to a Python script so that I can run mypy checks on it. I have a pandas.DataFrame object, which I iterate over like this:

import pandas

someTable: pandas.DataFrame = pandas.DataFrame()

# ...
# adding some data to someTable
# ...

for index, row in someTable.iterrows():
    #reveal_type(index)
    print(type(index))
    print(index + 1)

If I run this script, here's what I get:

$ python ./some.py
<class 'int'>
2
<class 'int'>
3

And if I check it with mypy, then it reports errors:

$ mypy ./some.py
some.py:32: note: Revealed type is "Union[typing.Hashable, None]"
some.py:34: error: Unsupported operand types for + ("Hashable" and "int")
some.py:34: error: Unsupported operand types for + ("None" and "int")
some.py:34: note: Left operand is of type "Optional[Hashable]"
Found 2 errors in 1 file (checked 1 source file)

As I understand it, mypy sees index as Union[typing.Hashable, None], which is not int, so index + 1 looks like an error to it. How and where should I annotate/hint it to satisfy mypy?

I tried this:

index: int
for index, row in someTable.iterrows():
    # ...

but that results in:

$ mypy ./some.py
some.py:32: error: Incompatible types in assignment (expression has type "Optional[Hashable]", variable has type "int")
Found 1 error in 1 file (checked 1 source file)

Solution

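  • Hinting index as Optional[int] does not help: mypy still rejects the assignment from Optional[Hashable], just as it rejected index: int, and index + 1 would not type check against None anyway.

    I'm not sure where Union[typing.Hashable, None] comes from; iterrows itself returns an Iterable[tuple[Hashable, Series]]. But it seems like you can safely assert that if index is assigned a value, then it will be an int and not None. With typing.cast and a separate loop variable:

    import typing

    # raw_index keeps the type mypy infers (Optional[Hashable]);
    # index is a new name, so it can be an int.
    for raw_index, row in someTable.iterrows():
        index = typing.cast(int, raw_index)  # no runtime check is performed
        print(index + 1)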
    

    (Is the Union supposed to reflect the possibility of the iterable raising StopIteration? That doesn't seem right, as a function that raises an exception doesn't return None; it doesn't return at all.)
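Since typing.cast does nothing at runtime, an isinstance check is an alternative if you want the assumption verified as well as narrowed. The following is only a sketch: int_indexed is a hypothetical helper (not part of pandas), and it assumes the index labels are plain Python ints, as the output in the question suggests. A list of tuples stands in for someTable.iterrows() so the snippet runs without pandas installed.

```python
from typing import Hashable, Iterable, Iterator, Optional, Tuple, TypeVar

RowT = TypeVar("RowT")


def int_indexed(
    pairs: Iterable[Tuple[Optional[Hashable], RowT]],
) -> Iterator[Tuple[int, RowT]]:
    """Yield (index, row) pairs, verifying at runtime that each index is an int.

    Hypothetical helper for illustration; not part of pandas.
    """
    for label, row in pairs:
        # isinstance() narrows Optional[Hashable] to int for mypy and also
        # fails loudly if the index turns out not to hold ints.
        if not isinstance(label, int):
            raise TypeError(f"expected int index label, got {type(label).__name__}")
        yield label, row


# Intended use with pandas (assumes a default integer index):
#     for index, row in int_indexed(someTable.iterrows()):
#         print(index + 1)

# Stand-in data so the sketch runs without pandas installed.
fake_rows = [(1, "first row"), (2, "second row")]
for index, row in int_indexed(fake_rows):
    print(index + 1)
```

The trade-off compared with cast is a small per-row check in exchange for a clear error if the DataFrame is ever given a string or datetime index.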