Tags: python, static-methods, python-dataclasses

Python access dataclass default_factory field without instantiating it


I have a classmethod, create_instance(), on a dataclass Bar. Before returning an instance, it needs to read a field of the dataclass that is declared with field(default_factory=...):

from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Bar:
    number: int
    static_dict: Dict[str, int] = field(
        default_factory=lambda: {
            'foo': 1,
        }
    )

    @classmethod
    def create_instance(cls) -> 'Bar':
        access = cls.static_dict
        return cls(number=access['foo'])

if __name__ == '__main__':
    foo = Bar.static_dict  # 1) fails already
    foo = Bar.create_instance()  # 2) fails too since it uses 1)

This fails with:

AttributeError: type object 'Bar' has no attribute 'static_dict'

Without type-hinting the return value, e.g.

@classmethod
def create_instance(cls):
    access = cls.static_dict
    return cls(number=access['foo'])

it fails the same way, reporting that the dataclass itself doesn't have the attribute:

AttributeError: type object 'Bar' has no attribute 'static_dict'

The same error occurs with a staticmethod, e.g. with type hinting:

@staticmethod
def create_instance_static() -> 'Bar':
    access = Bar.static_dict
    return Bar(number=access['foo'])

My very ugly workaround is to define a global variable and reference it from the class, which lets create_instance() work:

from dataclasses import dataclass, field
from typing import Dict

GLOBAL_DICT = {
    'foo': 2,
}


@dataclass
class BarWorkaround:
    number: int
    static_dict: Dict[str, int] = field(
        default_factory=lambda: GLOBAL_DICT
    )

    @classmethod
    def create_instance(cls) -> 'BarWorkaround':
        access = GLOBAL_DICT
        return cls(number=access['foo'])


if __name__ == '__main__':
    foo = BarWorkaround.create_instance()  # works as intended

Can't say I'm a fan of defining globals for something like this. My guess is something about the way dataclasses are constructed interferes with my accessing the static default attribute, but I don't get what it is.


Solution

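  • The dataclass decorator removes the class attribute for any field declared with default_factory, so static_dict exists only on instances and Bar.static_dict raises AttributeError. You can keep create_instance as a class method and call dataclasses.fields, which returns a tuple of Field objects for the given dataclass. Each Field exposes its default_factory, which you can call without instantiating the class.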

    For example:

    from dataclasses import dataclass, field, fields
    from typing import Dict
    
    
    @dataclass
    class Bar:
        number: int
        static_dict: Dict[str, int] = field(
            default_factory=lambda: {
                'foo': 1,
            }
        )
    
        @classmethod
        def create_instance(cls) -> 'Bar':
            access = next(f for f in fields(cls)
                          if f.name == 'static_dict').default_factory()
    
            return cls(number=access['foo'])
    
    
    if __name__ == '__main__':
        # this should fail (`static_dict` is an instance - not class - attribute)
        # foo = Bar.static_dict
    
        # this should work though
        static_dict_field = next(f for f in fields(Bar) if f.name == 'static_dict')
        lambda_fn = static_dict_field.default_factory
        print(lambda_fn())  # {'foo': 1}
    
        foo = Bar.create_instance()  # 2) works with `dataclasses.fields`
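
To see why the class-level lookup fails in the first place, here is a minimal sketch comparing a field with a plain default against one with a default_factory. The Demo class and its field names are hypothetical, made up only for this illustration. As far as I can tell from CPython's behaviour, a plain default stays on the class as an ordinary attribute, while a factory-based field leaves nothing behind on the class.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import Dict


@dataclass
class Demo:  # hypothetical class, for illustration only
    plain: int = 1
    made: Dict[str, int] = field(default_factory=lambda: {'foo': 1})


# A plain default stays available as a class attribute.
plain_on_class = Demo.plain

# A default_factory field leaves no class attribute behind.
has_made_on_class = hasattr(Demo, 'made')

# The factory is still reachable through the Field object.
made_field = {f.name: f for f in fields(Demo)}['made']
factory_result = made_field.default_factory()

print(plain_on_class)                 # 1
print(has_made_on_class)              # False
print(made_field.default is MISSING)  # True
print(factory_result)                 # {'foo': 1}
```

So the question's workaround is not strictly needed: the factory is stored on the Field, just not on the class namespace.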
    

    If you plan to call create_instance many times, it might be worth caching the looked-up default_factory callable (not the dict it returns, which should stay fresh per instance), for example with a "cached" class property, or with a simple workaround like this:

    from dataclasses import dataclass, field, fields
    from typing import Dict, ClassVar, Callable
    
    
    @dataclass
    class Bar:
        number: int
        static_dict: Dict[str, int] = field(
            default_factory=lambda: {
                'foo': 1,
            }
        )
        # added for type hinting and IDE support
        #
        # note: `dataclasses` ignores anything annotated with `ClassVar`, or else
        # not annotated to begin with.
        __static_dict_factory__: ClassVar[Callable[[], Dict[str, int]]]
    
        @classmethod
        def create_instance(cls) -> 'Bar':
            return cls(number=cls.__static_dict_factory__()['foo'])
    
    
    # need to set it outside the class body, since fields() only works once
    # @dataclass has processed the class (or perhaps consider a metaclass approach)
    setattr(Bar, '__static_dict_factory__',
            next(f for f in fields(Bar) if f.name == 'static_dict').default_factory)
    
    
    if __name__ == '__main__':
        foo1 = Bar.create_instance()
        foo2 = Bar.create_instance()
    
        foo1.static_dict['key'] = 123
        print(foo1.static_dict)
    
        # assert each instance gets its own copy of the dict
        assert foo1.static_dict is not foo2.static_dict
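
As an alternative to setting a class attribute after the fact, the lookup can be cached in a small helper. This is only a sketch: default_factory_of is a hypothetical helper name, not part of dataclasses, and it assumes the named field really was declared with a default_factory.

```python
from dataclasses import dataclass, field, fields
from functools import lru_cache
from typing import Any, Callable, Dict


@lru_cache(maxsize=None)
def default_factory_of(cls: type, name: str) -> Callable[[], Any]:
    """Hypothetical helper: look up and cache a field's default_factory."""
    for f in fields(cls):
        if f.name == name:
            return f.default_factory
    raise AttributeError(f'{cls.__name__} has no dataclass field {name!r}')


@dataclass
class Bar:
    number: int
    static_dict: Dict[str, int] = field(default_factory=lambda: {'foo': 1})

    @classmethod
    def create_instance(cls) -> 'Bar':
        # Only the lookup is cached, per (cls, name). The factory itself
        # still runs on every call, so each call reads a fresh dict.
        return cls(number=default_factory_of(cls, 'static_dict')()['foo'])


foo1 = Bar.create_instance()
foo2 = Bar.create_instance()
print(foo1.number)                           # 1
print(foo1.static_dict is foo2.static_dict)  # False
```

The helper keeps the class body free of extra ClassVar declarations, at the cost of one module-level function. Whether that reads better than the setattr version is a matter of taste.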