
How to use dataclass special methods with multiple inheritance?


from dataclasses import dataclass, field
from typing import Dict


@dataclass
class A:
    a: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        self.a = {'a1': 0, 'a2': 0}


    def add_key_a(self, key):
        # Use the argument as the key, not the literal string 'key'.
        self.a[key] = 0

@dataclass
class B:
    b: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        self.b = {'b1': 0, 'b2': 0}

    def add_key_b(self, key):
        # Use the argument as the key, not the literal string 'key'.
        self.b[key] = 0

@dataclass
class C(A, B):
    pass

user = C()
print(user)
# C(b={}, a={'a1': 0, 'a2': 0})

I get an empty 'b' dictionary, but expected to get "{'b1': 0, 'b2': 0}". I searched the Internet but did not find a proper explanation of or solution to this problem (I probably should have searched better). How can I solve it?


Solution

  • Using multiple inheritance requires classes to be cooperative by calling their super() methods in appropriate places. Just like __init__ should defer to super().__init__, __post_init__ should defer to super().__post_init__.

    Since dataclasses do not have a common baseclass, deferring to a super method must be defensive; getattr with a no-op function can be used to skip the super call as needed.

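    from dataclasses import dataclass, field
    from typing import Dict


    @dataclass
    class A:
        a: Dict[str, int] = field(default_factory=dict)

        def __post_init__(self):
            # Defer to the next __post_init__ in the MRO, if there is one.
            getattr(super(), "__post_init__", lambda: None)()
            self.a = {'a1': 0, 'a2': 0}

        def add_key_a(self, key):
            self.a[key] = 0


    @dataclass
    class B:
        b: Dict[str, int] = field(default_factory=dict)

        def __post_init__(self):
            # Defer to the next __post_init__ in the MRO, if there is one.
            getattr(super(), "__post_init__", lambda: None)()
            self.b = {'b1': 0, 'b2': 0}

        def add_key_b(self, key):
            self.b[key] = 0


    @dataclass
    class C(A, B):
        pass


    print(C())
    # C(b={'b1': 0, 'b2': 0}, a={'a1': 0, 'a2': 0})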
    

    Naively, one would just use super().__post_init__() to call __post_init__ of the super class. But since dataclass works via code generation instead of inheritance, there is no shared base class that defines __post_init__: for the last dataclass in the method resolution order, the next class is object, which has no __post_init__ method! Thus, the final lookup will fail:

    >>> c = C()
    >>> super(C, c).__post_init__  # initial __post_init__ used by C instances
    <bound method A.__post_init__ of C(b={}, a={'a1': 0, 'a2': 0})>
    >>> super(A, c).__post_init__  # second __post_init__ used by C 
    <bound method B.__post_init__ of C(b={}, a={'a1': 0, 'a2': 0})>
    >>> super(B, c).__post_init__  # final __post_init__ used by C 
    ...
    AttributeError: 'super' object has no attribute '__post_init__'
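
To see why the last lookup ends up at object, it can help to inspect the method resolution order directly. The following is a minimal sketch that assumes the original, non-cooperative classes from the question (helper methods omitted); the names mro_names and defines_hook are only illustrative and not part of the original code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class A:
    a: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        self.a = {'a1': 0, 'a2': 0}


@dataclass
class B:
    b: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        self.b = {'b1': 0, 'b2': 0}


@dataclass
class C(A, B):
    pass


# Order in which classes are searched for attributes of a C instance.
mro_names = [cls.__name__ for cls in C.__mro__]
print(mro_names)  # ['C', 'A', 'B', 'object']

# Which classes define __post_init__ in their own namespace?
defines_hook = {cls.__name__: '__post_init__' in vars(cls) for cls in C.__mro__}
print(defines_hook)  # {'C': False, 'A': True, 'B': True, 'object': False}
```

The lookup on a C instance stops at A, the first class that defines the hook. Unless A defers to super(), B's hook is never reached, and after B only object remains.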
    

    The way to fix this is straightforward: just catch the AttributeError if it occurs and do nothing in that case. We could do that with try: except: blocks, but there is a terser way.
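
For comparison, the try/except variant might look like the sketch below. It is an illustration under the same class layout as the question, not the form this answer recommends; the local name parent_post_init is an assumption introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class A:
    a: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        # Look up first, call afterwards: an AttributeError raised
        # inside a parent's __post_init__ is then not swallowed.
        try:
            parent_post_init = super().__post_init__
        except AttributeError:
            pass  # no further __post_init__ in the MRO
        else:
            parent_post_init()
        self.a = {'a1': 0, 'a2': 0}


@dataclass
class B:
    b: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        try:
            parent_post_init = super().__post_init__
        except AttributeError:
            pass  # no further __post_init__ in the MRO
        else:
            parent_post_init()
        self.b = {'b1': 0, 'b2': 0}


@dataclass
class C(A, B):
    pass


c = C()
print(c)  # C(b={'b1': 0, 'b2': 0}, a={'a1': 0, 'a2': 0})
```

Only the attribute lookup sits inside the try block, so errors from the parent hook itself still propagate. The cost is six lines per class for what is a single defensive call.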

    The builtin getattr function fetches an attribute or falls back to a default. Instead of a.b, we can use getattr(a, "b", default). Since we are getting a method to call, a useful default is a callable that does nothing.

    >>> lambda : None    # callable that does nothing
    <function __main__.<lambda>()>
    >>> # definition | call
    >>> (lambda: None)()  # calling does nothing
    >>> # getattr fetches attribute/method...
    >>> getattr(super(A, c), "__post_init__")
    <bound method B.__post_init__ of C(b={}, a={'a1': 0, 'a2': 0})>
    >>> # ... and can handle a default
    >>> getattr(super(B, c), "__post_init__", lambda: None)
    <function __main__.<lambda>()>
    

    Putting this into action, we replace the ….__post_init__ lookup with getattr. Notably, just as we needed () for the call after the ….__post_init__ lookup, we still need () for the call after the getattr lookup.

    super().__post_init__()
    #super | method      | call
    
    #      |super |  | method      |  | default  | | call
    getattr(super(), "__post_init__", lambda: None)()
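
If all the dataclasses involved are under your control, one possible alternative to the defensive getattr is to give them a common base class with a no-op __post_init__, so that a plain super().__post_init__() always finds something to call. The sketch below assumes this; PostInitBase is a hypothetical name introduced for illustration and is not part of the dataclasses module.

```python
from dataclasses import dataclass, field
from typing import Dict


class PostInitBase:
    """Hypothetical base class that terminates the __post_init__ chain."""

    def __post_init__(self):
        pass  # deliberately does nothing


@dataclass
class A(PostInitBase):
    a: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        super().__post_init__()  # safe: PostInitBase ends the chain
        self.a = {'a1': 0, 'a2': 0}


@dataclass
class B(PostInitBase):
    b: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        super().__post_init__()
        self.b = {'b1': 0, 'b2': 0}


@dataclass
class C(A, B):
    pass


c = C()
print(c)  # C(b={'b1': 0, 'b2': 0}, a={'a1': 0, 'a2': 0})
```

This only works when every class in the hierarchy inherits from the shared base. For dataclasses that may be mixed with unrelated classes, the getattr form above remains the more robust choice.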