python, design-patterns, python-typing, mypy

Design pattern for combining separate branches into common structure


I've got applications that consist of a data loader and a data transformer. Each loader is a subclass of an abstract base loader and each transformer is a subclass of an abstract base transformer; I'll omit both base classes in the example below. There is a 1:1 mapping between the concrete loaders and transformers, i.e. it is known which loader and transformer belong together.

Say we have two loaders and two transformers, each pair handling its own data type:

class Data1: ...

class Data2: ...


class Loader1:
    def get_data(self) -> Data1: ...

class Loader2:
    def get_data(self) -> Data2: ...


class Transformer1:
    def transform_data(self, data: Data1) -> None: ...

class Transformer2:
    def transform_data(self, data: Data2) -> None: ...

These classes could now be combined into applications

class App1:
    Loader = Loader1
    Transformer = Transformer1

class App2:
    Loader = Loader2
    Transformer = Transformer2

with an accompanying factory

from typing import Union, Type

def make_app(use_app1: bool) -> Union[Type[App1], Type[App2]]:
    if use_app1:
        return App1
    else:
        return App2

This is how I'd like to use the above

def main(use_app1: bool) -> None:
    app = make_app(use_app1)
    loader = app.Loader()
    data = loader.get_data()
    transformer = app.Transformer()
    transformer.transform_data(data=data)

However, mypy complains:

error: Argument "data" to "transform_data" of "Transformer1" has incompatible type "Union[Data1, Data2]"; expected "Data1"  [arg-type]
error: Argument "data" to "transform_data" of "Transformer2" has incompatible type "Union[Data1, Data2]"; expected "Data2"  [arg-type]

Is there some way to convince mypy that the branches Loader1 -> Data1 -> Transformer1 and Loader2 -> Data2 -> Transformer2 are separate and will not be mixed?

Is there an alternative pattern that could be used for this use case?


Solution

  • Okay, here's a third attempt at solving this. This time I'm using protocols to tell mypy that, in many of these functions, the specific type being returned doesn't matter, as long as the returned object has a certain interface.

    from typing import Type, Protocol, cast
    
    
    ### ABSTRACT INTERFACES ###
    
    
    class DataProto(Protocol): ...
        
    
    class LoaderProto(Protocol):
        def get_data(self) -> DataProto: ...
        
    
    class TransformerProto(Protocol):
        def transform_data(self, data: DataProto) -> None: ...
    
    
    class AppProto(Protocol):
        Loader: Type[LoaderProto]
        Transformer: Type[TransformerProto]
        
    
    ### CONCRETE IMPLEMENTATIONS ###
        
    
    class Data1: ...
    
    
    class Data2: ...
    
    
    class Loader1:
        def get_data(self) -> Data1: ...
    
    
    class Loader2:
        def get_data(self) -> Data2: ...
    
    
    class Transformer1:
        def transform_data(self, data: Data1) -> None: ...
    
    
    class Transformer2:
        def transform_data(self, data: Data2) -> None: ...
    
    
    class App1:
        Loader = Loader1
        Transformer = Transformer1
    
    
    class App2:
        Loader = Loader2
        Transformer = Transformer2
        
    
    GenericAppClassType = Type[AppProto]
        
    
    def make_app(use_app1: bool) -> GenericAppClassType:
        if use_app1:
            return cast(GenericAppClassType, App1)
        else:
            return cast(GenericAppClassType, App2)
    
    
    def main(use_app1: bool) -> None:
        app = make_app(use_app1)
        loader = app.Loader()
        data = loader.get_data()
        transformer = app.Transformer()
        transformer.transform_data(data=data)
    

    Try it out on mypy playground here.
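The answer above relies on `cast`, because `Transformer1` only accepts `Data1` and so does not structurally match `TransformerProto`. As a possible alternative, here is a minimal sketch that ties each loader/transformer pair together with a shared type variable instead. The names `Loader`, `Transformer`, `App`, `D` and the `run` method are assumptions made for illustration and do not come from the question; `run` returns the loaded data only so the example is observable. I'd expect mypy to accept this without casts, since the data never leaves the scope in which `D` is a single type, but treat it as a sketch rather than a verified drop-in.

```python
from typing import Generic, TypeVar, Union

D = TypeVar("D")  # data type shared by one loader/transformer pair


class Data1: ...


class Data2: ...


class Loader(Generic[D]):
    def get_data(self) -> D:
        raise NotImplementedError


class Transformer(Generic[D]):
    def transform_data(self, data: D) -> None:
        raise NotImplementedError


class Loader1(Loader[Data1]):
    def get_data(self) -> Data1:
        return Data1()


class Loader2(Loader[Data2]):
    def get_data(self) -> Data2:
        return Data2()


class Transformer1(Transformer[Data1]):
    def transform_data(self, data: Data1) -> None:
        pass  # placeholder for the real transformation


class Transformer2(Transformer[Data2]):
    def transform_data(self, data: Data2) -> None:
        pass  # placeholder for the real transformation


class App(Generic[D]):
    """Hypothetical wrapper that pairs a loader and a transformer of the same D."""

    def __init__(self, loader: Loader[D], transformer: Transformer[D]) -> None:
        self.loader = loader
        self.transformer = transformer

    def run(self) -> D:
        # Inside this method D is one concrete type, so no Union is ever formed.
        data = self.loader.get_data()
        self.transformer.transform_data(data)
        return data


def make_app(use_app1: bool) -> Union[App[Data1], App[Data2]]:
    if use_app1:
        return App(Loader1(), Transformer1())
    return App(Loader2(), Transformer2())


def main(use_app1: bool) -> None:
    make_app(use_app1).run()
```

The design choice here is to move the load-then-transform step into `App.run`, so the caller only ever holds a union of whole apps rather than a union of data objects. A mismatched pair such as `App(Loader1(), Transformer2())` should then be reported by the type checker instead of being hidden behind a cast.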