What is the proper way in Python to define a dataclass that has both an auto-generated __init__ and an additional init2 from a dict of values?


In Python, I have a dataclass that holds over a dozen members. I use it to create a dict that I post into ElasticSearch.

Now I want to get a dict from ElasticSearch and use it to initialize the dataclass.

Since:

  1. Python doesn't allow defining a second __init__ with a different signature.
  2. I don't want to manually write the __init__, which is auto-generated, just to add an optional parameter.
  3. I don't want to add an optional parameter to accept the dict, so that the __init__ remains auto-generated.

I thought of adding a second method, init2, which parses the passed dict, forwards its values to the auto-generated __init__ method, and returns the resulting instance of the dataclass.


I would appreciate your input on whether my suggested solution below is the correct implementation.

Also, can this implementation be considered a type of factory?

Thanks.


Follow-up: Since the JSON/dictionary I get from the ES request:

  1. Has exactly the same keys as the dataclass's field names

  2. Is flat, i.e., there are no nested objects.

I could simply unpack the dict with ** into the auto-generated __init__ method.

See my answer below for this specific case:


from dataclasses import dataclass

@dataclass
class MyData:
    name: str
    age: int = 17

    @classmethod
    def init_from_dict(cls, values_in_dict: dict):
        # Original line using MyData was fixed to use cls, following @ForceBru 's comment
        # return MyData(values_in_dict['name'], age=values_in_dict['age'])
        return cls(values_in_dict['name'], age=values_in_dict['age'])

my_data_1: MyData = MyData('Alice')
print(my_data_1)

my_data_2: MyData = MyData('Bob', 15)
print(my_data_2)

values_in_dict_3: dict = {
    'name': 'Carol',
    'age': 20
}

my_data_3: MyData = MyData.init_from_dict(values_in_dict_3)
print(my_data_3)

# Another init which uses the auto-generated __init__ works in this specific
# case because the values' dict is flat and the keywords are the same as the
# parameter names in the dataclass.
# This allows me to do this
my_data_4: MyData = MyData(**values_in_dict_3)
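
One caveat with the ** unpacking on the last line: the auto-generated __init__ raises a TypeError if the dict contains any key that is not a field of the dataclass. Below is a minimal sketch, assuming unknown keys can simply be ignored, that filters the dict against dataclasses.fields before unpacking. The method name from_dict and the extra '_id' key are hypothetical, used only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class MyData:
    name: str
    age: int = 17

    @classmethod
    def from_dict(cls, values: dict):
        # Keep only the keys that the generated __init__ accepts.
        known = {f.name for f in fields(cls) if f.init}
        return cls(**{k: v for k, v in values.items() if k in known})


# '_id' is a made-up extra key standing in for data the dataclass does not declare.
doc = {'name': 'Carol', 'age': 20, '_id': 'abc123'}
print(MyData.from_dict(doc))
```

Missing keys still fall back to the field defaults (age stays 17 when the dict has no 'age'), and a missing required field such as name still raises a TypeError, which is probably what you want.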

Solution

  • There's a potential bug in your code. Consider this:

    class Thing:
        def __init__(self, a, b):
            self.a, self.b = a, b
    
        @classmethod
        def from_int(cls, value):
            return Thing(value, value + 1)
    
    class AnotherOne(Thing):
        def __init__(self, a, b):
            self.a, self.b = a + 1, b + 2
    

    Now, if you run AnotherOne.from_int(6) you'd get a Thing object:

    >>> AnotherOne.from_int(6)
    <__main__.Thing object at 0x8f4a04c>
    

    ...while you probably wanted to create an AnotherOne object!

    To fix this, create the object like this:

    class Thing:
        ...
    
        @classmethod
        def from_int(cls, value):
            return cls(value, value + 1)  # Use `cls` instead of `Thing`
    

    I think your code is otherwise fine: indeed, one common use of classmethod is to provide alternative constructors, i.e. ways to create an instance of a class other than calling __init__ directly.
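
The same point applied to the dataclass from the question, as a small sketch: because init_from_dict builds the object through cls, calling it on a subclass returns an instance of that subclass. MyExtendedData and is_adult are hypothetical names added here only to demonstrate this.

```python
from dataclasses import dataclass


@dataclass
class MyData:
    name: str
    age: int = 17

    @classmethod
    def init_from_dict(cls, values_in_dict: dict):
        # cls is whichever class the method was called on, so subclasses work too.
        return cls(values_in_dict['name'], age=values_in_dict['age'])


@dataclass
class MyExtendedData(MyData):  # hypothetical subclass, for illustration only
    def is_adult(self) -> bool:
        return self.age >= 18


item = MyExtendedData.init_from_dict({'name': 'Carol', 'age': 20})
print(type(item).__name__)  # MyExtendedData
```

Had the method returned MyData(...) instead of cls(...), item would be a plain MyData and the is_adult call would fail with an AttributeError.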