Python dataclass does not raise an error when a class variable is assigned a list (but does with a type annotation)


I'm trying to get familiar with dataclasses in Python. One thing I learned from reading online is that turning a regular class with a mutable class variable (which is a bad thing) into a dataclass should prevent that problem. For example:

regular class:

class A:
    a = []
    
    def __init__(self):
        self.b = 1

This can cause problems: all instances share the same class variable a, so one instance can modify it without the others knowing.
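A minimal sketch of that sharing, in case it helps to see it run. The instance names x and y and the appended string are only for illustration:

```python
class A:
    a = []  # class attribute: one list object shared by every instance

    def __init__(self):
        self.b = 1


x = A()
y = A()
x.a.append("from x")  # mutates the single list stored on the class

print(y.a)         # ['from x']
print(x.a is y.a)  # True
```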

And with a dataclass:

from dataclasses import dataclass

@dataclass
class A:
    a: list = []

    def __init__(self):
        self.b = 1

This class definition is rejected with an error:

ValueError: mutable default <class 'list'> for field a is not allowed: use default_factory

However, if I simply remove the type annotation:

@dataclass
class A:
    a = []

    def __init__(self):
        self.b = 1

there is no complaint at all, and a is still shared across instances.

Is this expected?

Why would a simple type annotation change the behavior of the class variable?

(I'm using Python 3.7.6.)


Solution

  • When you declare

    @dataclass
    class A:
        a = []
    
        def __init__(self):
            self.b = 1
    

    a is not a dataclass field. REF: https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-302987864

    You can inspect the __dataclass_fields__ and __annotations__ attributes after declaring the class to see the difference.

    In [55]: @dataclass
        ...: class A:
        ...:     a: list = field(default_factory=list)
        ...:
        ...:     def __init__(self):
        ...:         self.b = 1
        ...:
    
    In [56]: A.__dict__
    Out[56]:
    mappingproxy({'__module__': '__main__',
                  '__annotations__': {'a': list},
                  '__init__': <function __main__.A.__init__(self)>,
                  '__dict__': <attribute '__dict__' of 'A' objects>,
                  '__weakref__': <attribute '__weakref__' of 'A' objects>,
                  '__doc__': 'A()',
                  '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
                  '__dataclass_fields__': {'a': Field(name='a',type=<class 'list'>,default=<dataclasses._MISSING_TYPE object at 0x7f8a27ada250>,default_factory=<class 'list'>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)},
                  '__repr__': <function dataclasses.__repr__(self)>,
                  '__eq__': <function dataclasses.__eq__(self, other)>,
                  '__hash__': None})
    
    In [57]: @dataclass
        ...: class A:
        ...:     a = []
        ...:
        ...:     def __init__(self):
        ...:         self.b = 1
        ...:
    
    In [58]: A.__dict__
    Out[58]:
    mappingproxy({'__module__': '__main__',
                  'a': [],
                  '__init__': <function __main__.A.__init__(self)>,
                  '__dict__': <attribute '__dict__' of 'A' objects>,
                  '__weakref__': <attribute '__weakref__' of 'A' objects>,
                  '__doc__': 'A()',
                  '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
                  '__dataclass_fields__': {},
                  '__repr__': <function dataclasses.__repr__(self)>,
                  '__eq__': <function dataclasses.__eq__(self, other)>,
                  '__hash__': None})
    

    From PEP 557:

    "The dataclass decorator examines the class to find fields. A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation."

    REF: How to add a dataclass field without annotating the type?

    The mutable-default check runs only on dataclass fields, not on plain class variables. Here is the check in the dataclasses source that raises the error:

      if f._field_type is _FIELD and isinstance(f.default, (list, dict, set)):
    

    Why mutable types are not allowed: https://docs.python.org/3/library/dataclasses.html#mutable-default-values
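To put the points above together, here is a small sketch rather than a definitive reference. The class names Annotated, Unannotated and Bad are hypothetical and chosen only for this example. It uses dataclasses.fields() to show which names the decorator treats as fields, then compares how the list behaves in each case:

```python
from dataclasses import dataclass, field, fields


@dataclass
class Annotated:
    # Annotated, so it is a field; default_factory builds a new list per instance.
    a: list = field(default_factory=list)


@dataclass
class Unannotated:
    # No annotation, so dataclass ignores it: a plain, shared class attribute.
    a = []


print([f.name for f in fields(Annotated)])    # ['a']
print([f.name for f in fields(Unannotated)])  # []

p, q = Annotated(), Annotated()
p.a.append(1)
print(q.a)  # [] (q has its own list)

r, s = Unannotated(), Unannotated()
r.a.append(1)
print(s.a)  # [1] (same list object as r.a)

# An annotated field with a bare list default is rejected at class creation.
message = None
try:
    @dataclass
    class Bad:
        a: list = []
except ValueError as exc:
    message = str(exc)
print(message)
```

So the annotation is what makes a a field, and only fields go through the mutable-default check. If a shared class-level list is intended, leaving it unannotated (or annotating it with typing.ClassVar) keeps it out of the field list.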