Search code examples
pythongenericspython-typing

What is the proper type-hinting for this function which accepts either an instance or a builder, and returns the instance?


Here is a simple function that receives either an instance of Foo, in which case it returns the instance, or an instance of a class which can build Foo, in which case it builds and return a new instance of Foo.

The logic is simple and the code runs without error, but both pylance and mypy complain that my return types don't match the type hinting in the signature.

# python 3.9.5
from typing import TypeVar, Union


class Foo:
    pass


class FooBuilder:
    def build(self) -> Foo:
        return Foo()


_F = TypeVar("_F", bound=Foo)
_FB = TypeVar("_FB", bound=FooBuilder)


def get_foo(x: Union[_F, _FB]) -> _F:
    if isinstance(x, Foo):
        return x  # <<< mypy error #1
    elif isinstance(x, FooBuilder):
        return x.build()  # <<< mypy error #2; pylance error


assert isinstance(get_foo(Foo()), Foo)  # passes without error
assert isinstance(get_foo(FooBuilder()), Foo)  # passes without error

Pylance only complains about the second return:

Expression of type "Foo*" cannot be assigned to return type "_F@get_foo"

Whereas mypy complains about both:

main.py:19: error: Incompatible return value type (got "Foo", expected "_F") [return-value]

main.py:21: error: Incompatible return value type (got "Foo", expected "_F") [return-value]

Neither error was helpful when searching, other than to determine it doesn't think the return types match the signature, but to me, it looks like they do.

In real life, I'm passing instances that inherit from Foo and FooBuilder, thus the use of TypeVar, instead of simply using the classes directly in the signature of the function. It seems wrong that one of my types (_FB) is only present once in the signature instead of twice, which is usually why I'd use TypeVar; however, it's important the function take two different types, each with different base classes, yet only return the non-builder instance. Is there something other than TypeVar I should be using for this?


Solution

  • The problem does not only lie in the get_foo function annotation, but in your type guards within its body.

    Let's focus on the first case only for now. You specified that you can pass an instance of _F to get_foo and it will return an instance of that same type _F. That is what the type variable in a function signature is for: Binding to the type(s) of the passed argument(s) and collapsing the return type accordingly.

    Say you had the FooSub class inheriting from Foo and you passed an instance of FooSub to get_foo. The type variable _F would collapse to FooSub and the return type would be expected to be FooSub.

    But your isinstance(x, Foo) check in the function body acts as a type guard. Past that check, the type checker Mypy assumes x is of type Foo and no longer of whatever _F was. As we established you could have passed a FooSub object, which would require you to return a FooSub object. Since FooSub is a subtype of and thus more narrow than Foo, returning an instance of Foo is not type safe.


    <EDIT> The specific way in which a type checker should interpret this type guard is a matter of debate/implementation. Arguably the most sane approach is to still consider x to be of type _F because the upper bound defined for _F is Foo. Pyright seems to already treat it as such. But for Mypy this is still an unresolved issue. Thanks to @SUTerliakov for pointing this out in a comment.

    In other words, your use of the isinstance check is not wrong per se. It is just not clear enough for some type checkers in the presence of a type variable. </EDIT>


    If you know what you are doing and that isinstance approach is really necessary, you could simply cast the variable x to the type _F before returning or use a type: ignore directive. But I would not recommend that because I doubt that isinstance check is necessary at all. This brings me to the second case.

    FooBuilder as you showed it above is not generic. Its build return type is fixed to Foo. Therefore whatever subtype of FooBuilder is passed to get_foo, the output of its build method call will always be inferred as an instance of Foo and nothing else, specifically not _F.

    So assuming subclasses of FooBuilder can actually return different types (subclasses of Foo), you should probably make FooBuilder generic anyway. Although my suggested solution below works fine even if FooBuilder is not generic.

    Then, if all you care about in the second use case is that the object passed to get_foo has a build method and is generic in terms of that method's return type, you could define a generic protocol for that. This part is actually important.

    Lastly, if you make that protocol runtime_checkable, you can start with an isinstance check for that protocol, do what you need to do with that builder object and eventually return its build output. Since that covers one of the two possible types of the union the argument x is an instance of, the type checker will know that the remaining branch (equivalent to the previous isinstance check returning False) covers the other type, namely _F. So you do not need to explicitly check that again. Instead you can simply assume that x will be of type _F.

    Altogether, I would recommend something like this:

    from typing import Generic, Protocol, TypeVar, Union, runtime_checkable
    
    
    class Foo:
        ...
    
    
    _F = TypeVar("_F", bound=Foo, covariant=True)
    
    
    class FooBuilder(Generic[_F]):
        """Example of the actual builder base class"""
        def __init__(self, foo_cls: type[_F]) -> None:
            self.foo_cls = foo_cls
    
        def build(self) -> _F:
            return self.foo_cls()
    
    
    @runtime_checkable
    class BuildsFoo(Protocol[_F]):
        def build(self) -> _F: ...
    
    
    def get_foo(x: Union[_F, BuildsFoo[_F]]) -> _F:
        if isinstance(x, BuildsFoo):
            return x.build()
        return x
    

    (That is a literal ellipsis ... in the BuildsFoo.build method.)

    Usage demo:

    ...
    
    class Bar(Foo):
        ...
    
    
    obj1 = get_foo(Bar())
    obj2 = get_foo(FooBuilder(Bar))
    reveal_type(obj1)  # note: Revealed type is "Bar"
    reveal_type(obj2)  # note: Revealed type is "Bar"
    

    Here is a Mypy Playground with this code.