Tags: python, dictionary, attributes, python-dataclasses

How can I control the order of attributes when converting dataclasses to a dict in Python?


I want to use named data in my project, but in the end I have to convert the data to a CSV file.

My approach is to use a list of dataclass instances in the code as data containers. In the end, the list of dataclass instances is converted to a list of dicts and from that to a CSV file.

I want to control the order in which the attributes appear in the final data object. Here is my code:

from dataclasses import dataclass

@dataclass
class MyClass:
    c: str
    b: str
    a: str = "Default"

objdata = [
    MyClass(c="Foo", b="Bar"),
    MyClass(c="Test", b="Test"),
    MyClass(c="asdf", b="yxcv")
    ]

dictdata = []
attributes = [a for a in dir(objdata[0]) if not a.startswith('__')]
for entry in objdata:
    csv_entry = {}
    for attribute in attributes:
        csv_entry[attribute] = getattr(entry, attribute)
    dictdata.append(csv_entry)

I have several dataclasses with different attributes, and I want to reuse the code that converts a dataclass to a dict, so I loop over the attributes of the class. The problem is that dir returns the attributes in alphabetical order (a, b, c), not in the order I declared them in the class (c, b, a). Is there a way to control this?


Solution

  • You should just use dataclasses.asdict, which implements this behavior for any instance of a class that was decorated with the dataclasses.dataclass code generator. The resulting dict has its keys in the order the fields were declared.

    >>> import dataclasses
    >>> @dataclasses.dataclass
    ... class MyClass:
    ...     c: str
    ...     b: str
    ...     a: str = "Default"
    ...
    >>> objdata = [
    ...     MyClass(c="Foo", b="Bar"),
    ...     MyClass(c="Test", b="Test"),
    ...     MyClass(c="asdf", b="yxcv")
    ...     ]
    >>>
    >>> for obj in objdata:
    ...     print(dataclasses.asdict(obj))
    ...
    {'c': 'Foo', 'b': 'Bar', 'a': 'Default'}
    {'c': 'Test', 'b': 'Test', 'a': 'Default'}
    {'c': 'asdf', 'b': 'yxcv', 'a': 'Default'}
    

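    Note that this is implemented using dataclasses.fields(obj), which ultimately just accesses the __dataclass_fields__ class variable. Because that variable is a dict and dicts preserve insertion order, the fields come back in declaration order. The details are all pretty easy to follow in the source code.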
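To make the ordering concrete, here is a minimal sketch of reading the field names in declaration order through dataclasses.fields. The helper name to_ordered_dict is hypothetical, and unlike dataclasses.asdict it is shallow: it neither recurses into nested dataclasses nor copies values.

```python
import dataclasses


@dataclasses.dataclass
class MyClass:
    c: str
    b: str
    a: str = "Default"


def to_ordered_dict(obj):
    # Hypothetical helper: a shallow, non-recursive stand-in for asdict.
    # dataclasses.fields returns Field objects in declaration order.
    return {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}


# fields() accepts either the class or an instance.
field_names = [f.name for f in dataclasses.fields(MyClass)]
row = to_ordered_dict(MyClass(c="Foo", b="Bar"))

print(field_names)
print(row)
```

This can replace the dir-based loop from the question for any dataclass, since it does not depend on the attribute names being known in advance.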

    Also, if you simply want to write every object that you know is of some specific dataclass type to a CSV file, I wouldn't create intermediate dicts. You could just do:

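    import csv
    import dataclasses
    import operator
    
    fields = [f.name for f in dataclasses.fields(MyClass)]
    # attrgetter with several names returns a tuple of values in that order
    getter_func = operator.attrgetter(*fields)
    
    # Open for writing; newline="" lets the csv module control line endings
    with open("data.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(fields)  # If you want a header
        writer.writerows(map(getter_func, objdata))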
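If you do prefer to go through dicts, as in the original approach, a possible sketch combines dataclasses.asdict with csv.DictWriter. The helper name write_csv is hypothetical, and the example writes to an in-memory io.StringIO buffer instead of a real file so that it stays self-contained.

```python
import csv
import dataclasses
import io


@dataclasses.dataclass
class MyClass:
    c: str
    b: str
    a: str = "Default"


objdata = [
    MyClass(c="Foo", b="Bar"),
    MyClass(c="Test", b="Test"),
]


def write_csv(objects, cls, stream):
    # Hypothetical helper: works for any dataclass type passed as cls.
    # The column order follows the field declaration order of cls.
    fieldnames = [f.name for f in dataclasses.fields(cls)]
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(dataclasses.asdict(obj) for obj in objects)


buffer = io.StringIO()
write_csv(objdata, MyClass, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With DictWriter the column order is fixed by fieldnames rather than by the key order of each dict, so the output stays consistent even if the rows were built some other way.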