
Python dataclass inheritance with class variables


Consider the following example code:

from dataclasses import dataclass, field
from typing import ClassVar


@dataclass
class Base:
    x: str = field(default='x', init=False)


@dataclass
class A(Base):
    name: str


@dataclass
class B(Base):
    name: str


a = A('test_a')
b = B('test_b')

a.x = 'y'
print(a.x)  # prints 'y'
print(b.x)  # prints 'x'

which prints 'y' and 'x' as expected.

Now I'd like to make x a ClassVar of type dict:

from dataclasses import dataclass, field
from typing import ClassVar, Dict


@dataclass
class Base:
    x: ClassVar[Dict[str, str]] = field(default={'x': 'x'}, init=False)


@dataclass
class A(Base):
    name: str


@dataclass
class B(Base):
    name: str


a = A('test_a')
b = B('test_b')

a.x['y'] = 'y'
a.x
b.x

However, now the output is

a.x => {'x': 'x', 'y': 'y'}
b.x => {'x': 'x', 'y': 'y'}

I'd expect only a.x to be modified, with b.x staying at its default value {'x': 'x'}.

If the field were not a ClassVar, I could use default_factory=dict, but that doesn't work in combination with ClassVar because it raises the error

 Field cannot have a default factory
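
Both observations can be reproduced in isolation. The following is a minimal sketch, assuming a recent CPython 3 release; the class name Broken is hypothetical and exists only to trigger the error. It suggests that the ClassVar default lives once on Base and that A and B merely look it up there:

```python
from dataclasses import dataclass, field
from typing import ClassVar, Dict

# A default_factory on a ClassVar is rejected while the class is being built.
error_message = None
try:
    @dataclass
    class Broken:  # hypothetical name, used only for this demonstration
        x: ClassVar[Dict[str, str]] = field(default_factory=dict)
except TypeError as exc:
    error_message = str(exc)

print(error_message)


@dataclass
class Base:
    x: ClassVar[Dict[str, str]] = {'x': 'x'}


@dataclass
class A(Base):
    name: str


@dataclass
class B(Base):
    name: str


# Neither subclass defines x in its own namespace ...
print('x' in vars(A), 'x' in vars(B))
# ... so attribute lookup on A and B resolves to the one dict stored on Base.
print(A.x is Base.x and B.x is Base.x)
```

Because there is only one dict object, a mutation through any instance is visible through every class in the hierarchy.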

Solution

  • A class variable declared in a parent is a single object that all child classes inherit by attribute lookup, so what you seem to want (a class variable that is declared in a parent, but child classes get their own copy they can manipulate) is not something a plain ClassVar can give you.

    If you want to do it properly, you have to re-declare the class variable in every child:

    from dataclasses import dataclass
    from typing import ClassVar, Dict
    
    
    @dataclass
    class Base:
        x: ClassVar[Dict[str, str]] = {'x': 'x'}
    
    
    @dataclass
    class A(Base):
        x: ClassVar[Dict[str, str]] = {'x': 'x'}
        name: str
    
    
    @dataclass
    class B(Base):
        x: ClassVar[Dict[str, str]] = {'x': 'x'}
        name: str
    
    a = A('test_a')
    b = B('test_b')
    
    a.x['y'] = 'y'
    a.x
    b.x
    

    Which now gives

    a.x => {'x': 'x', 'y': 'y'}
    b.x => {'x': 'x'}
    

    But if that is too cumbersome or impractical, I have this nifty footgun for you. Rather than using ClassVars, it just programs your requirements explicitly into the base class as a function, and makes it look like an attribute with the @property decorator:

    from dataclasses import dataclass
    from typing import Dict
    
    
    @dataclass
    class Base:
    
        @property
        def x(self) -> Dict[str, str]:
            cls = type(self)
            # the first access through each class creates that class's own dict
            if "_x" not in cls.__dict__:  # hasattr() would also find an inherited "_x"
                cls._x = {"x": "x"}  # store the actual state of "x" in "_x"
            return cls._x
    
    
    @dataclass
    class A(Base):
        name: str
    
    
    @dataclass
    class B(Base):
        name: str
    
    
    a_1 = A('test_a_1')
    a_2 = A('test_a_2')
    b = B('test_b')
    
    a_1.x['y'] = 'y'
    a_1.x
    a_2.x
    b.x
    

    This also shares x only among instances of the same child class, but without requiring an extra line in each new child:

    a_1.x => {'x': 'x', 'y': 'y'}
    a_2.x => {'x': 'x', 'y': 'y'}
    b.x => {'x': 'x'}
    

    One caveat is that, unlike a ClassVar, a property cannot be read from the class itself without an instance: A.x returns the property object rather than the dict. But it seems you weren't trying to do that anyway.
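
A further option, not part of the answer above and offered only as a sketch, is to keep x as a real ClassVar and let the base class hand each subclass its own copy through `__init_subclass__`. This assumes a shallow copy of the inherited dict is enough, and that a subclass of A should start from whatever A.x holds at the moment that subclass is defined:

```python
from dataclasses import dataclass
from typing import ClassVar, Dict


@dataclass
class Base:
    x: ClassVar[Dict[str, str]] = {'x': 'x'}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Rebind x on the new subclass to a shallow copy of the inherited
        # dict, so mutations through one subclass do not reach its siblings.
        cls.x = dict(cls.x)


@dataclass
class A(Base):
    name: str


@dataclass
class B(Base):
    name: str


a_1 = A('test_a_1')
a_2 = A('test_a_2')
b = B('test_b')

a_1.x['y'] = 'y'
print(a_1.x)  # shared by all instances of A
print(a_2.x)
print(b.x)    # B keeps its own untouched copy
print(A.x)    # still a plain class attribute, readable without an instance
```

With this approach the child classes need no extra line, and x remains accessible on the class itself. The trade-off is that the copying happens implicitly at class-definition time, which may surprise readers who only look at the child classes.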