I'm trying to implement a simple ORM in Python. I'm facing a code duplication issue and I don't know how to solve it. Here is a simplified example of a class in my project:
class Person:
    TABLE_NAME = 'person'
    FIELDS = [
        ('name', 'VARCHAR(50)'),
        ('age', 'INTEGER')
    ]
    # CODE DUPLICATION: the next two lines should be generated from FIELDS, not hard-coded...
    name: str
    age: int

    def __init__(self, **kwargs):
        self.__dict__ = kwargs

    @classmethod
    def create_sql_table(cls):
        # use TABLE_NAME and FIELDS to create sql table
        pass

alice = Person(name='Alice', age=25)
print(alice.name)
If I remove the two lines name: str and age: int, I lose auto-completion and I get a mypy error on the print line (Error: Person has no attribute name).
But if I keep them, I have code duplication (I write each field name twice).
Is there a way to avoid the code duplication (for instance, by generating these two lines from the FIELDS variable)?
Or is there another way to implement this class that avoids code duplication (without the mypy error or the loss of auto-completion)?
You can use descriptors:
from typing import Generic, TypeVar, Any, overload, Union

T = TypeVar('T')

class Column(Generic[T]):
    sql_type: str  # the field type used for this column

    def __init__(self) -> None:
        self.name = ''  # the name of the column

    # this is called when the Person class (not the instance) is created
    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name  # now contains the name of the attribute in the class

    # the overload for the case: Person.name -> Column[str]
    @overload
    def __get__(self, instance: None, owner: Any) -> 'Column[T]': ...

    # the overload for the case: Person().name -> str
    @overload
    def __get__(self, instance: Any, owner: Any) -> T: ...

    # the implementation of attribute access
    def __get__(self, instance: Any, owner: Any) -> Union[T, 'Column[T]']:
        if instance is None:
            return self
        # implement your attribute access here
        return getattr(instance, f'_{self.name}')  # type: ignore

    # the implementation for setting attributes
    def __set__(self, instance: Any, value: T) -> None:
        # maybe check here that the type matches
        setattr(instance, f'_{self.name}', value)
Now we can create specializations for each column type:
class Integer(Column[int]):
    sql_type = 'INTEGER'

class VarChar(Column[str]):
    def __init__(self, size: int) -> None:
        self.sql_type = f'VARCHAR({size})'
        super().__init__()
When you define the Person class, you can then use the column types:
class Person:
    TABLE_NAME = 'person'
    name = VarChar(50)
    age = Integer()

    def __init__(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def create_sql_table(cls) -> None:
        print("CREATE TABLE", cls.TABLE_NAME)
        for key, value in vars(cls).items():
            if isinstance(value, Column):
                print(key, value.sql_type)

Person.create_sql_table()
p = Person(age=10)
print(p.age)
p.age = 20
print(p.age)
This prints:
CREATE TABLE person
name VARCHAR(50)
age INTEGER
10
20
You should probably also create a base Model class that contains the __init__ and the class method of Person.
You can also extend the Column class to allow nullable columns and add default values.
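Both suggestions could look roughly like the sketch below. The `Model` base class, the `nullable` and `default` parameters, and the `ddl` and `create_table_sql` helpers are hypothetical names chosen for illustration; none of them come from the code above. The column classes are repeated in a simplified, untyped form so the snippet runs on its own, and the default is applied on the Python side only (it is not written into the SQL).

```python
from typing import Any, Iterator

_MISSING = object()  # sentinel: tells "no default given" apart from default=None


class Column:
    sql_type: str = ''

    def __init__(self, nullable: bool = True, default: Any = _MISSING) -> None:
        self.name = ''
        self.nullable = nullable  # hypothetical option
        self.default = default    # hypothetical option

    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name

    def __get__(self, instance: Any, owner: Any) -> Any:
        if instance is None:
            return self
        return getattr(instance, f'_{self.name}')

    def __set__(self, instance: Any, value: Any) -> None:
        # reject None for columns declared as non-nullable
        if value is None and not self.nullable:
            raise ValueError(f'{self.name} must not be None')
        setattr(instance, f'_{self.name}', value)

    def ddl(self) -> str:
        # column definition fragment, e.g. "name VARCHAR(50) NOT NULL"
        suffix = '' if self.nullable else ' NOT NULL'
        return f'{self.name} {self.sql_type}{suffix}'


class Integer(Column):
    sql_type = 'INTEGER'


class VarChar(Column):
    def __init__(self, size: int, **options: Any) -> None:
        super().__init__(**options)
        self.sql_type = f'VARCHAR({size})'


class Model:
    TABLE_NAME = ''

    @classmethod
    def columns(cls) -> Iterator[Column]:
        # only looks at the class itself, not at inherited columns
        for value in vars(cls).values():
            if isinstance(value, Column):
                yield value

    def __init__(self, **kwargs: Any) -> None:
        for column in self.columns():
            value = kwargs.pop(column.name, column.default)
            if value is _MISSING:
                value = None  # no value and no default: fall back to NULL
            setattr(self, column.name, value)
        if kwargs:
            raise TypeError(f'unknown fields: {sorted(kwargs)}')

    @classmethod
    def create_table_sql(cls) -> str:
        body = ', '.join(column.ddl() for column in cls.columns())
        return f'CREATE TABLE {cls.TABLE_NAME} ({body})'


class Person(Model):
    TABLE_NAME = 'person'
    name = VarChar(50, nullable=False)
    age = Integer(default=0)


alice = Person(name='Alice')  # age falls back to its default
sql = Person.create_table_sql()
print(sql)
```

With this layout, each model only declares its table name and columns. Calling `Person()` without a name raises `ValueError`, because the missing value becomes `None` and the column is not nullable.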
Mypy does not complain and correctly infers the attribute types on instances: Person().name is str and Person().age is int. Accessed on the class itself, Person.name is a Column[str].
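To see the two `__get__` overloads at work, here is a small self-contained check. It repeats a trimmed-down `Column` and uses `Column[str]()` directly instead of the `VarChar` subclass, which is a simplification for this sketch. The comments on the `reveal_type` lines describe what I would expect mypy to report; the exact wording of the message depends on the mypy version.

```python
from typing import TYPE_CHECKING, Any, Generic, TypeVar, Union, overload

T = TypeVar('T')


class Column(Generic[T]):
    def __set_name__(self, owner: Any, name: str) -> None:
        self.name = name

    @overload
    def __get__(self, instance: None, owner: Any) -> 'Column[T]': ...

    @overload
    def __get__(self, instance: Any, owner: Any) -> T: ...

    def __get__(self, instance: Any, owner: Any) -> Union[T, 'Column[T]']:
        if instance is None:
            return self  # class access: hand back the descriptor itself
        return getattr(instance, f'_{self.name}')  # type: ignore

    def __set__(self, instance: Any, value: T) -> None:
        setattr(instance, f'_{self.name}', value)


class Person:
    name = Column[str]()
    age = Column[int]()


p = Person()
p.name = 'Alice'
p.age = 25

if TYPE_CHECKING:
    # Only seen by the type checker; skipped at runtime.
    reveal_type(Person.name)  # expected: Column[str]
    reveal_type(p.name)       # expected: str
    reveal_type(p.age)        # expected: int
    # An assignment such as p.age = 'old' should be rejected by mypy,
    # because __set__ only accepts a value of type T (int here).
```

At runtime the same distinction holds: `Person.name` is the `Column` object, while `p.name` is the stored string.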