
MetaClasses to remove duplication of class definitions


I have a few classes defined as below in Python:

class Item:
    def __init__(self, name):
        self.name = name

class Group:
    def __init__(self, name):
        self.name = name
        self.items = {}

    def __getitem__(self, name):
        return self.items[name]

    def __setitem__(self, name, item):
        self.items[name] = item

class Section:
    def __init__(self, name):
        self.name = name
        self.groups = {}

    def __getitem__(self, name):
        return self.groups[name]

    def __setitem__(self, name, group):
        self.groups[name] = group

class List:
    def __init__(self, name):
        self.name = name
        self.sections = {}

    def __getitem__(self, name):
        return self.sections[name]

    def __setitem__(self, name, section):
        self.sections[name] = section

The pattern of Group, Section and List is similar. Is there a way in Python using MetaClasses to refactor this to avoid code duplication?


Solution

  • Yes. I'd do it using inheritance, but instead of hard-coding the specific attribute name in __init__, I would set it as a class attribute. The base could even be declared as abstract.

    class GroupBase():
        collection_name = "items"
        
        def __init__(self, name):
            self.name = name
            # setattr needs the target object first, then the attribute name and value
            setattr(self, self.collection_name, {})
    
        def __getitem__(self, name):
            return getattr(self, self.collection_name)[name]
    
    
        def __setitem__(self, name, item):
            getattr(self, self.collection_name)[name] = item
        
    class Section(GroupBase):
        collection_name = "groups"
    
    class List(GroupBase):
        collection_name = "sections"
    

    Note that more class attributes could be used at runtime, for example to specify the item type for each collection, and enforce typing inside __setitem__, if needed.
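
    As a minimal sketch of that note, assuming a hypothetical extra class attribute called item_type (it is not part of the code above), __setitem__ could reject values of the wrong type. The Group subclass is also an assumption here, added so that each level has its own declared item type:

```python
class Item:
    def __init__(self, name):
        self.name = name


class GroupBase:
    collection_name = "items"
    # Hypothetical attribute: the type accepted by __setitem__.
    # "object" accepts anything, so the base class stays permissive.
    item_type = object

    def __init__(self, name):
        self.name = name
        setattr(self, self.collection_name, {})

    def __getitem__(self, name):
        return getattr(self, self.collection_name)[name]

    def __setitem__(self, name, item):
        # Enforce the declared item type before storing the value
        if not isinstance(item, self.item_type):
            raise TypeError(
                f"{type(self).__name__} only accepts "
                f"{self.item_type.__name__} instances, "
                f"got {type(item).__name__}"
            )
        getattr(self, self.collection_name)[name] = item


class Group(GroupBase):
    collection_name = "items"
    item_type = Item


class Section(GroupBase):
    collection_name = "groups"
    item_type = Group


class List(GroupBase):
    collection_name = "sections"
    item_type = Section
```

    With this in place, assigning an Item directly into a Section raises TypeError, while assigning a Group works as before.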

    Or, as you asked, it is possible to literally use a string-template system and call "exec" inside a metaclass to create the new classes. That would be closer to what "templates" are. The class code itself lives inside a string, and the patterns can use normal string substitution with .format(). The major difference from C++ templates is that the language runtime itself does the substitution at runtime, instead of at compile (to bytecode) time. The exec function compiles the text template at that point. Yes, it is slower than pre-compiled code, but since it runs just once, at import time, that does not make a difference:

    group_class_template = """\
    class {name}:
        def __init__(self, name):
            self.name = name
            self.{collection_name} = {{}}
    
        def __getitem__(self, name):
            return self.{collection_name}[name]
    
        def __setitem__(self, name, item):
            self.{collection_name}[name] = item
    """
    
    class TemplateMeta(type):
        def __new__(mcls, name, bases, cls_namespace, template):
            # It would be possible to exec the template in the namespace of the
            # module where the stub is defined, so that expressions in the
            # template can access names defined there. To do that, use the
            # stub's globals as the global dictionary for exec (needs "import sys"):
            # modulespace = sys._getframe().f_back.f_globals
    
            # Otherwise, keeping it simple, just use an empty dict:
            modulespace = {}
    
            cls_namespace["name"] = name
    
            exec(template.format(**cls_namespace), modulespace)
            # The class is actually exec-ed with no custom metaclass: plain "type" is used.
            # Just return the created class. It will be added to the module namespace,
            # but special attributes like "__qualname__" and "__module__" won't be set correctly.
            # They can be set here with plain assignments, if it matters that they are correct.
            return modulespace[name]
    
    
    class Item:
        def __init__(self, name):
            self.name = name
    
    
    class Group(metaclass=TemplateMeta, template=group_class_template):
        collection_name = "items"
    
    
    class Section(metaclass=TemplateMeta, template=group_class_template):
        collection_name = "groups"
        
    
    class List(metaclass=TemplateMeta, template=group_class_template):
        collection_name = "sections"
    

    After pasting this into the REPL, I can use the created classes directly:

    In [66]: a = Group("bla")
    
    In [67]: a.items
    Out[67]: {}
    
    In [68]: a["x"] = 23
    
    In [69]: a["x"]
    Out[69]: 23
    
    
    In [70]: a.items
    Out[70]: {'x': 23}
    
    

    The major drawback of doing it this way is that the template itself is seen just as a string, so tooling such as linters, static type checkers, and IDE auto-complete based on static scanning won't work for the templated classes. The idea could be evolved so that templates are valid Python code in ".py" files, which can be read like any other file at import time. One would just need a templating system other than the built-in str.format, so that the templates themselves are valid code. For example, if names that start and end with a single underscore are defined as the ones to be substituted in the template, regular expressions could be used for the name replacement instead of .format.
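
    A rough sketch of that last idea, under a few assumptions: the template text is inlined here instead of being read from a ".py" file, and the names PLACEHOLDER, render and make_class are hypothetical helpers, not part of any library. The pattern only matches names with exactly one leading and one trailing underscore, so dunder names such as __init__ are left alone:

```python
import re

# In practice this text would be read from a ".py" file at import time.
# It is valid Python as written, so linters and IDEs can parse it.
template_source = '''\
class _name_:
    def __init__(self, name):
        self.name = name
        self._collection_name_ = {}

    def __getitem__(self, name):
        return self._collection_name_[name]

    def __setitem__(self, name, item):
        self._collection_name_[name] = item
'''

# One leading underscore, a name starting with a letter, one trailing
# underscore. The word boundaries keep "__init__" from matching.
PLACEHOLDER = re.compile(r"\b_([A-Za-z]\w*?)_\b")


def render(source, **values):
    def replace(match):
        key = match.group(1)
        # Unknown names are left untouched so ordinary identifiers survive
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER.sub(replace, source)


def make_class(name, **values):
    namespace = {}
    exec(render(template_source, name=name, **values), namespace)
    return namespace[name]


Group = make_class("Group", collection_name="items")
Section = make_class("Section", collection_name="groups")
```

    The same render function could be called from the metaclass in place of template.format, keeping the stub-class syntax shown earlier.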