python, python-3.x, properties, python-dataclasses

Weird Issue when using dataclass and property together


I ran into a strange issue while trying to use a dataclass together with a property.

I have reduced it to a minimal example that reproduces it:

import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: str = None

    def save(self):
        print(self.uploaded_by)

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

p = FileObject()
p.save()

This outputs:

Setter Called with Value  <property object at 0x7faeb00150b0>
<property object at 0x7faeb00150b0>

I would expect to get None instead of the property object.

Am I doing something wrong here or have I stumbled across a bug?

After reading @juanpa.arrivillaga's answer, I thought that making uploaded_by an InitVar might fix the issue, but it still returns a property object. I think that is because of this point from the answer:

the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__.

The only option I can find that works with a default value is to remove uploaded_by from the dataclass definition and write an actual __init__. That has the unfortunate side effect of requiring a hand-written __init__, which negates some of the value of using a dataclass. Here is what I did:

import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: dataclasses.InitVar=None
    other_attrs: str = None

    def __init__(self, uploaded_by=None, other_attrs=None):
        self._uploaded_by = uploaded_by
        self.other_attrs = other_attrs

    def save(self):
        print("Uploaded by: ", self.uploaded_by)
        print("Other Attrs: ", self.other_attrs)

    @property
    def uploaded_by(self):
        if not self._uploaded_by:
            print("Doing expensive logic that should not be repeated")
        return self._uploaded_by

p = FileObject(other_attrs="More Data")
p.save()

p2 = FileObject(uploaded_by='Already Computed', other_attrs="More Data")
p2.save()

Which outputs:

Doing expensive logic that should not be repeated
Uploaded by:  None
Other Attrs:  More Data
Uploaded by:  Already Computed
Other Attrs:  More Data

The negatives of doing this:

  • You have to write a boilerplate __init__ (my actual use case has about 20 attrs)
  • You lose the uploaded_by in the __repr__, but it is there in _uploaded_by
  • Calls to asdict, astuple, dataclasses.replace aren't handled correctly

So it's really not a fix for the issue.

I have filed a bug on the Python Bug Tracker: https://bugs.python.org/issue39247


Solution

  • So, unfortunately, the @property syntax is always interpreted as an assignment to uploaded_by (since, well, it is). The dataclass machinery interprets that assignment as a default value, which is why it passes the property object to the setter! It is equivalent to this:

    In [11]: import dataclasses
        ...:
        ...: @dataclasses.dataclass
        ...: class FileObject:
        ...:     uploaded_by: str
        ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
        ...:     def save(self):
        ...:         print(self.uploaded_by)
        ...:
        ...:     def _get_uploaded_by(self):
        ...:         return self._uploaded_by
        ...:
        ...:     def _set_uploaded_by(self, uploaded_by):
        ...:         print('Setter Called with Value ', uploaded_by)
        ...:         self._uploaded_by = uploaded_by
        ...:     uploaded_by = property(_get_uploaded_by, _set_uploaded_by)
        ...: p = FileObject()
        ...: p.save()
    Setter Called with Value  <property object at 0x10761e7d0>
    <property object at 0x10761e7d0>
    

    Which is essentially acting like this:

    In [13]: @dataclasses.dataclass
        ...: class Foo:
        ...:     bar:int = 1
        ...:     bar = 2
        ...:
    
    In [14]: Foo()
    Out[14]: Foo(bar=2)
    

    I don't think there is a clean way around this, and perhaps it could be considered a bug, but I'm really not sure what the solution should be. Essentially, the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__. You could perhaps either special-case the @property syntax, or maybe just the property object itself, so at least the behavior for @property and x = property(get_x, set_x) would be consistent...
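The point about defaults can be checked directly with dataclasses.fields. This is only a minimal sketch, assuming a stripped-down FileObject rather than the exact class from the question:

```python
import dataclasses


@dataclasses.dataclass
class FileObject:
    uploaded_by: str = None  # intended default, rebound by the property below

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value):
        self._uploaded_by = value


# By the time the decorator runs, the class-body name `uploaded_by` refers to
# the property, so that object is what gets recorded as the field default.
field = dataclasses.fields(FileObject)[0]
print(field.name, isinstance(field.default, property))
```

Because the generated __init__ uses that recorded default, calling FileObject() hands the property object to the setter.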

    To be clear, the following sort of works:

    In [22]: import dataclasses
        ...:
        ...: @dataclasses.dataclass
        ...: class FileObject:
        ...:     uploaded_by: str
        ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
        ...:     @property
        ...:     def uploaded_by(self):
        ...:         return self._uploaded_by
        ...:     @uploaded_by.setter
        ...:     def uploaded_by(self, uploaded_by):
        ...:         print('Setter Called with Value ', uploaded_by)
        ...:         self._uploaded_by = uploaded_by
        ...:
        ...: p = FileObject(None)
        ...: print(p.uploaded_by)
    Setter Called with Value  None
    None
    
    In [23]: FileObject()
    Setter Called with Value  <property object at 0x1086debf0>
    Out[23]: FileObject(uploaded_by=<property object at 0x1086debf0>)
    

    But notice, you cannot set a useful default value! It will always take the property... Even worse, IMO, if you don't want a default value it will always create one!
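One possible way to get a usable default back, which is not part of the original answer, is to let the setter treat an incoming property object as "no value supplied". The sketch below assumes that None is the desired default and that the backing attribute _uploaded_by is deliberately not declared as a field, so the generated __init__ never overwrites it:

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None

    @property
    def uploaded_by(self) -> Optional[str]:
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value) -> None:
        # When no argument is passed, the generated __init__ supplies the
        # property object itself; map that case to the real default.
        if isinstance(value, property):
            value = None  # assumed default for this sketch
        self._uploaded_by = value


print(FileObject())
print(FileObject("alice").uploaded_by)
```

The trade-off is that the real default now lives inside the setter rather than in the field declaration, so the signature shown by help() or inspect still reports the property object as the default.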

    EDIT: Found a potential workaround!

    This should have been obvious, but you can just set the property object on the class after the @dataclass decorator has run, so the decorator still sees None as the default.

    import dataclasses
    import typing

    @dataclasses.dataclass
    class FileObject:
        uploaded_by: typing.Optional[str] = None
    
        def _uploaded_by_getter(self):
            return self._uploaded_by
    
        def _uploaded_by_setter(self, uploaded_by):
            print('Setter Called with Value ', uploaded_by)
            self._uploaded_by = uploaded_by
    
    FileObject.uploaded_by = property(
        FileObject._uploaded_by_getter,
        FileObject._uploaded_by_setter
    )
    p = FileObject()
    print(p)
    print(p.uploaded_by)
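As a rough check that this workaround also cooperates with the helpers the question was worried about, the sketch below repeats the same class (with the print call dropped from the setter for brevity) and runs it through dataclasses.asdict and dataclasses.replace. The variable names p and q are just for illustration:

```python
import dataclasses
import typing


@dataclasses.dataclass
class FileObject:
    uploaded_by: typing.Optional[str] = None

    def _uploaded_by_getter(self):
        return self._uploaded_by

    def _uploaded_by_setter(self, uploaded_by):
        self._uploaded_by = uploaded_by


# Attach the property only after the decorator has recorded None as default.
FileObject.uploaded_by = property(
    FileObject._uploaded_by_getter,
    FileObject._uploaded_by_setter,
)

p = FileObject()
q = dataclasses.replace(p, uploaded_by="alice")

print(dataclasses.asdict(p))  # reads the value through the property getter
print(q)
```

Here the default stays None in the field metadata, and both helpers go through the property, since they read and write the attribute by its field name.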