Search code examples
pythontype-hintingmypy

Python type-hints where only the first item is None


I am implementing a priority queue class in Python using a binary heap, which has a attribute that stores the binary tree. In my implementation, I'm using 1-indexing because it has the nice property that I can use lists (say, the list is called heap) and the children of heap[i] are exactly heap[2*i] and heap[2*i+1], which are nice.

I am padding the heap attribute with an initial inaccessible None to facilitate this 1-indexed array.

Below is the implementation I've got (minimal working example):

Item = TypeVar("Item")

class PriorityQueue(Generic[Item]):
    def __init__(self) -> None:
        self._heap: list[Optional[Item]] = [None]
        # ...

but then this runs into problems at

    def dequeue(self) -> Item:
        # ...
        return item_dequeued

This method returns an item from the end of the heap list, which we can guarantee to be of the type Item (not None). However, MyPy was like

mypy: error
return-value - Incompatible return value type (got "Optional[int]", expected "int")

But in the type-hint for the method I didn't really want to change the return type to Optional[Item] because it less accurately reflects the behaviour of the method.

What's the correct way to type-hint all this?

Thank in advance!

I have a few ideas in mind:

  1. Pad the list with an Item instead. But this makes the code less obvious, hiding the fact that the first item in a dummy.

  2. dummy_item = -1
    self._heap = [dummy_item] # now mypy can correctly guess the type of _heap
    

    but this feels wrong to me somehow.

  3. Put in # type: ignore[return-value] (possibly last resort?)

I'm looking for the most correct way to do this, following the guidelines of mypy or just more being pythonic in general.


Solution

  • (This answer predates the change from Item being an alias for int to a type variable for a generic queue. For the generic class, I would be even more inclined to give up on 1-based indexing, because now you either use the too-loose type list[Optional[Item]] to describe the heap, or you make the user responsible for providing an appropriate Item value for queue-specific internal purposes.)


    None serves no semantic purpose here, so using it just complicates your typing. Any arbitrary integer would work just as well, because it's an internal implementation detail of your class.

    class PriorityQueue:
        Item: TypeAlias = int
    
        def __init__(self) -> None:
            self._heap: list[PriorityQueue.Item] = [0]
            # ...
    

    If you really want, you can use a class "constant" to emphasize its actual value.

    class PriorityQueue:
        Item: TypeAlias = int
        _SHIM: ClassVar[Item] = 0
    
        def __init__(self) -> None:
            self._heap: list[PriorityQueue.Item] = [PriorityQueue._SHIM]
            # ...
    

    Or, don't worry about 1-based indexing. With 0-based indexing, the arithmetic works out to be parent i has children 2*i + 1 and 2*i + 2; 1-based indexing only saves you an addition when specifically traversing to the left, and nobody outside PriorityQueue needs to know or care about the indexing. (For all they know, you aren't using an array at all, and have an explicit tree built from nodes like

    class HeapNode(Generic[Item]):
        value: Item
        left: HeapNode[Item]
        right: HeapNode[Item]
    

    )