I have Python code that has the following shape to it:
from dataclasses import dataclass

@dataclass
class Foo_Data:
    foo: int

class Foo_Processor:
    def process(self, data: Foo_Data): ...

class Foo_Loader:
    def load(self, file_path: str) -> Foo_Data: ...

#----------------------------------------------------------------

@dataclass
class Bar_Data:
    bar: str

class Bar_Processor:
    def process(self, data: Bar_Data): ...

class Bar_Loader:
    def load(self, file_path: str) -> Bar_Data: ...

I have several instances of this sort of Data/Processor/Loader setup, and the classes all have the same method signatures modulo the specific class family (Foo, Bar, etc.). Is there a Pythonic way of formalizing this relationship among classes to enforce a similar structure if I decide to create a Spam_Data, Spam_Processor, and Spam_Loader family of classes? For instance, I want something to enforce that Spam_Processor have a process method which takes an argument of type Spam_Data. Is there a way of achieving this standardization somehow with abstract classes, generic types, or some other structure?
I tried using abstract classes, but mypy correctly points out that having all *_Data classes be subclasses of an abstract Data class and similarly having all *_Processor classes be subclasses of an abstract Processor class violates the Liskov substitution principle, since each processor is only designed for its respective Data class (i.e., Foo_Processor can't process Bar_Data, but one would expect that it could if these classes have superclasses Processor and Data which are compatible in this way).

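You can use abstract base classes (ABCs) together with generics. The base processor and loader are parameterized by the data type they work with, so Foo_Processor is a subtype of BaseProcessor[Foo_Data] rather than of a processor that claims to accept any data. That avoids the Liskov violation while still defining a common interface: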
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

# generic type variable for Data
T = TypeVar('T', bound='BaseData')

@dataclass
class BaseData(ABC):
    pass

class BaseProcessor(ABC, Generic[T]):
    @abstractmethod
    def process(self, data: T) -> None:
        pass

class BaseLoader(ABC, Generic[T]):
    @abstractmethod
    def load(self, file_path: str) -> T:
        pass

Now you can define your specific classes:
@dataclass
class Foo_Data(BaseData):
    foo: int

class Foo_Processor(BaseProcessor[Foo_Data]):
    def process(self, data: Foo_Data) -> None: ...

class Foo_Loader(BaseLoader[Foo_Data]):
    def load(self, file_path: str) -> Foo_Data: ...

@dataclass
class Bar_Data(BaseData):
    bar: str

class Bar_Processor(BaseProcessor[Bar_Data]):
    def process(self, data: Bar_Data) -> None: ...

class Bar_Loader(BaseLoader[Bar_Data]):
    def load(self, file_path: str) -> Bar_Data: ...

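Writing your code this way combines a common interface with type safety.
ABCs ensure that subclasses implement the required methods: instantiating a subclass that leaves process or load unimplemented raises a TypeError.
Generics tie each processor and loader to its own data class, so mypy flags a Foo_Processor whose process method is annotated to take Bar_Data, without implying that every processor can handle every data class.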
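
The runtime side of this, where an ABC refuses to instantiate an incomplete subclass, can be sketched as follows. This is a minimal illustration using the Spam family from the question. Spam_Data's spam field and the deliberately incomplete Broken_Spam_Processor are assumptions made up for the example. Note that the ABC machinery only checks that the method is overridden. Whether its parameter types match is checked by mypy, not at runtime.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T", bound="BaseData")


@dataclass
class BaseData(ABC):
    pass


class BaseProcessor(ABC, Generic[T]):
    @abstractmethod
    def process(self, data: T) -> None:
        pass


# Hypothetical Spam family; the `spam` field is an assumption for illustration.
@dataclass
class Spam_Data(BaseData):
    spam: float


class Spam_Processor(BaseProcessor[Spam_Data]):
    def process(self, data: Spam_Data) -> None:
        print(f"processing spam={data.spam}")


class Broken_Spam_Processor(BaseProcessor[Spam_Data]):
    # process() is not overridden, so this class is still abstract.
    pass


# A complete subclass can be instantiated and used normally.
Spam_Processor().process(Spam_Data(spam=1.5))

# An incomplete subclass fails as soon as it is instantiated.
try:
    Broken_Spam_Processor()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact wording of the TypeError differs between Python versions, but it names the class and the missing abstract method.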
As a confirmation with mypy:
mypy script.py
Success: no issues found in 1 source file
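
The generic parameter also lets helper code require a loader and a processor from the same family. Below is a minimal sketch, assuming a hypothetical run_pipeline helper and placeholder method bodies (the loader ignores the file contents and the processor just records what it sees); none of these bodies come from the original code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, List, TypeVar

T = TypeVar("T", bound="BaseData")


@dataclass
class BaseData(ABC):
    pass


class BaseProcessor(ABC, Generic[T]):
    @abstractmethod
    def process(self, data: T) -> None: ...


class BaseLoader(ABC, Generic[T]):
    @abstractmethod
    def load(self, file_path: str) -> T: ...


@dataclass
class Foo_Data(BaseData):
    foo: int


class Foo_Loader(BaseLoader[Foo_Data]):
    def load(self, file_path: str) -> Foo_Data:
        # Placeholder body: a real loader would read file_path.
        return Foo_Data(foo=len(file_path))


class Foo_Processor(BaseProcessor[Foo_Data]):
    def __init__(self) -> None:
        self.seen: List[int] = []

    def process(self, data: Foo_Data) -> None:
        # Placeholder body: record the value so the effect is observable.
        self.seen.append(data.foo)


@dataclass
class Bar_Data(BaseData):
    bar: str


class Bar_Processor(BaseProcessor[Bar_Data]):
    def process(self, data: Bar_Data) -> None:
        print(data.bar.upper())


def run_pipeline(
    loader: BaseLoader[T], processor: BaseProcessor[T], file_path: str
) -> None:
    """Hypothetical helper: both arguments must share the same data type T."""
    processor.process(loader.load(file_path))


foo_processor = Foo_Processor()
run_pipeline(Foo_Loader(), foo_processor, "data.foo")

# Mixing families is a type error: mypy should reject the line below,
# because no single T fits both BaseLoader[Foo_Data] and BaseProcessor[Bar_Data].
# run_pipeline(Foo_Loader(), Bar_Processor(), "data.foo")
```

Python itself does not check the annotations, so the mismatched call would only fail at runtime once Bar_Processor touches an attribute that Foo_Data lacks. The guarantee comes from running mypy.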