python, inheritance, python-dataclasses

Control initialization order when a Python dataclass inherits from a class


What I know
Python dataclasses support inheritance, whether the base is another dataclass or a plain class. By convention (in Python as in other languages), a subclass's initializer should call the base class's initializer first. In Python that looks like:

def __init__(self):
    super().__init__()
    ...

What I'm doing
Since dataclasses were introduced in Python 3.7, I have been considering replacing all of my classes with dataclasses. One benefit of a dataclass is that it generates __init__ for you. That becomes a problem when the dataclass needs to inherit from a base class, for example:

from dataclasses import dataclass

class Base:
    def __init__(self):
        self.a = 1

@dataclass
class Child(Base):
    a: int
    def __post_init__(self):
        super().__init__()

My problem
The problem is that the call to the base initializer has to go inside __post_init__, which runs after the dataclass's generated __init__.
The downside is that this breaks the usual convention, and because the order is reversed, the subclass cannot override attributes set by the base class: Base.__init__ runs last and overwrites them.
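To make the ordering issue concrete, here is a minimal sketch that reuses the Base and Child classes from the question and only adds a final check. It shows that the value passed to Child is overwritten once __post_init__ calls the base initializer.

```python
from dataclasses import dataclass


class Base:
    def __init__(self):
        self.a = 1  # Base sets its own value for a


@dataclass
class Child(Base):
    a: int

    def __post_init__(self):
        # Runs after the generated __init__ has already assigned self.a
        super().__init__()


c = Child(a=3)
print(c.a)  # 1: Base.__init__ ran last and overwrote the value passed in
```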

This could be solved by something like a __pre_init__ hook. I have read the documentation and did not find anything like it. Am I missing something?


Solution

  • There is in fact a method that is called before __init__: __new__. So you can use this trick: call Base.__init__ from Child.__new__. I can't say whether it is a good solution, but if you're interested, here is a working example:

    from dataclasses import dataclass


    class Base:
        def __init__(self, a=1):
            self.a = a
    
    
    @dataclass
    class Child(Base):
        a: int
    
        def __new__(cls, *args, **kwargs):
            obj = object.__new__(cls)
            Base.__init__(obj, *args, **kwargs)
            return obj
    
    
    c = Child(a=3)
    print(c.a)  # 3: Base.__init__ runs first (inside __new__), then the generated Child.__init__ assigns a
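The call order behind this trick can be checked with a small trace. The sketch below is the same example with one addition that is not in the original answer: a module-level list named calls (a hypothetical helper, used here only for logging) that records which initializer runs when.

```python
from dataclasses import dataclass

calls = []  # hypothetical log, only used to record the call order


class Base:
    def __init__(self, a=1):
        calls.append("Base.__init__")
        self.a = a


@dataclass
class Child(Base):
    a: int

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        # Runs before the dataclass-generated __init__
        Base.__init__(obj, *args, **kwargs)
        return obj

    def __post_init__(self):
        # Called at the end of the generated __init__
        calls.append("Child.__post_init__")


c = Child(a=3)
print(calls)  # ['Base.__init__', 'Child.__post_init__']
print(c.a)    # 3
```

One thing to keep in mind: __new__ forwards every argument given to Child on to Base.__init__, so this only works as written when the two signatures are compatible.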