I took over a code base (supporting Python versions down to 3.9) and wanted to add some type hints. However, I am currently stuck on this function:
    def _pairwise(iterable: T.Iterable, end=None) -> T.Iterable:
        left, right = itertools.tee(iterable)
        next(right, None)
        return itertools.zip_longest(left, right, fillvalue=end)
It is later used to iterate over regex matches and extract their start and end indices for slicing. The final fill value of None makes the last slice run to the end of the string.
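As a rough illustration of that usage, here is a minimal sketch. The helper `split_sections`, the pattern, and the sample text are hypothetical and not taken from the actual code base; the sketch assumes each slice runs from one match start to the next.

```python
import itertools
import re
from typing import Iterable, Iterator


def _pairwise(iterable: Iterable, end=None) -> Iterator:
    # Same logic as the function above, repeated so the sketch runs on its own.
    left, right = itertools.tee(iterable)
    next(right, None)
    return itertools.zip_longest(left, right, fillvalue=end)


def split_sections(text: str, pattern: str) -> list[str]:
    # Hypothetical helper: cut `text` at the start of every match of `pattern`.
    starts = (match.start() for match in re.finditer(pattern, text))
    # For the last match `stop` is None, so text[start:None] runs to the end.
    return [text[start:stop] for start, stop in _pairwise(starts)]


sections = split_sections("# a\nfoo\n# b\nbar", r"(?m)^# ")
```

Here the last pair is `(8, None)`, which is why the final section includes everything up to the end of the string.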
We know that the actual signature should be

    _pairwise(iterable: Iterable[T], end: Optional[T] = None) -> Iterator[tuple[T, Optional[T]]]

because left is guaranteed to be at least as long as right.
However, the zip_longest approach does not allow that: type checkers infer its result as Iterator[tuple[Optional[T], Optional[T]]].
I have rewritten the function so that the type checker (pyright) is able to verify that target signature.
    def _pairwise(
        iterable: Iterable[T], end: Optional[T] = None
    ) -> Iterator[tuple[T, Optional[T]]]:
        left, right = itertools.tee(iterable)
        next(right, None)
        for x, y in zip(right, left):
            yield y, x
        if (last := next(left, None)) is not None:
            yield last, end
However, I am not particularly pleased with this result yet. First, there is the need to swap the arguments to zip to keep it from taking one extra step on left, as well as the manual check to deal with an empty iterable argument. The check also means that the function does not behave as intended when the last element is None (for example with T=NoneType). That is not actually a problem here, but it does annoy me a bit.
Is there any other way to get this pairwise functionality to typecheck?
Your zip_longest solution looks clean to me, and a # type: ignore[return-value] comment would be a good fit there, probably with a short explanatory comment.
However, to make your code typecheck, you could use plain zip and add a "filler" entry to the end manually, like this:
    import itertools
    from typing import Iterable, Iterator, TypeVar, Optional

    T = TypeVar('T')

    def _pairwise(
        iterable: Iterable[T], end: Optional[T] = None
    ) -> Iterator[tuple[T, Optional[T]]]:
        left, right = itertools.tee(iterable)
        next(right, None)
        return zip(left, itertools.chain(right, [end]))
Now mypy is happy with this code, and you're slightly more explicit: you know that the second iterable (right) is one element shorter than left unless iterable was empty, so appending one item makes them equal in length. For an empty input, both implementations produce an empty iterator, because zip stops as soon as left is exhausted.
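If you want to convince yourself that the two versions agree, a quick check along these lines should do. The name `_pairwise_longest` and the sample inputs are made up for this sketch.

```python
import itertools
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def _pairwise(
    iterable: Iterable[T], end: Optional[T] = None
) -> Iterator[tuple[T, Optional[T]]]:
    # The zip + chain version from above.
    left, right = itertools.tee(iterable)
    next(right, None)
    return zip(left, itertools.chain(right, [end]))


def _pairwise_longest(iterable, end=None):
    # The original zip_longest version, kept untyped as a reference.
    left, right = itertools.tee(iterable)
    next(right, None)
    return itertools.zip_longest(left, right, fillvalue=end)


# Empty input, a single item, several items, and inputs containing None.
samples = [[], [1], [1, 2, 3], [None], [1, None]]
results = [list(_pairwise(s, end=-1)) for s in samples]
expected = [list(_pairwise_longest(s, end=-1)) for s in samples]
```

Both lists should come out identical, including for inputs that contain None, since nothing here tests elements against None.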