Search code examples
pythonpython-typingmypy

Is this the correct way of use Protocol, and if it's why mypy fails?


I have the following two classes:

@runtime_checkable
class AbstractFolder(Protocol):
    def __iter__(self) -> "AbstractFolder":
        raise NotImplementedError

    def __next__(self) -> AbstractFileReadable:
        raise NotImplementedError

and their implementation:

class FileSystemFolder(AbstractFolder, Iterator):
    def __init__(self, path: str):
        self.path: str = path

    def __iter__(self) -> "FileSystemFolder":
        self.jobs: List[AbstractFileReadable] = [
            FileSystemFileReadable(path)
            for path in [*glob(os.path.join(self.path, "*"))]
        ]
        self.current: int = -1
        return self

    def __next__(self) -> FileSystemFileReadable:
        self.current += 1

        if self.current >= len(self.jobs):
            raise StopIteration

        return self.jobs[self.current]

and the following function

def process(folder: AbstractFolder) -> None:
...

The children are returning instances of the implementation, which could be a lot, but when I execute mypy I get the following error:

error: Incompatible return value type (got "AbstractFileReadable", expected "FileSystemFileReadable")

Is this the correct way to implement and use Protocol and typing?

Thanks


Solution

  • The typing error that mypy is detecting has nothing to do with your use of Protocol. Rather, it's a conflict between the type of self.jobs (which is defined in __iter__) and the return value from __next__.

    Despite the code only loading instances of FileSystemFileReadable into the self.jobs list, its type is hinted as List[AbstractFileReadable], so theoretically, some other code could add a different type of object to it without violating the type declarations. When you take values out of the list, in __next__, you are returning them, and the return type is annotated as FileSystemFileReadable, but mypy thinks the list could contain the wider abstract type.

    The solution to this is to change one of the annotations. Probably self.jobs should be hinted as List[FileSystemFileReadable]. Less likely, __next__ should return the looser AbstractFileReadable, just like the protocol does.

    On another note, unrelated to the type hinting problem you were asking about, it's probably bad form for __iter__ to be doing the heavy lifting to set up the iteration in your code. A better idea would be to do it in __init__, while __iter__ has just return self as its entire body. With the current implementation, an idiom like this will not work as expected:

    f = FileSystemFolder("some/path")
    
    first = next(f) # get one value for special processing
    rest = list(f)  # put the rest of the values in a list
    

    With your code, the first element yielded by the iterator will appear in rest along with all the others, since the __iter__ call made by the list constructor will reset the iteration to the start.