Attr: Deserialize deeply nested json?


I have a deeply nested JSON structure like this:

  json_data = """{
      "title": "...",
      "links": [
        {
          "href": "string",
          "method": {
            "method": "string"
          },
          "rel": "string"
        }
      ]
    }"""

My classes:

import attr
import typing as ty
import enum
import json


class HttpMethod(enum.Enum):
    GET = 0
    HEAD = 1
    POST = 2
    PUT = 3
    DELETE = 4
    CONNECT = 5
    OPTIONS = 6
    TRACE = 7
    PATCH = 8


@attr.s(frozen=True, auto_attribs=True)
class LinkInfo:
    href: str = attr.ib()
    link: str = attr.ib(converter=ensure_cls)
    method: HttpMethod = attr.ib(converter=lambda x: x.name)

@attr.s(frozen=True, auto_attribs=True)
class AnalysisTemplateGetModel:
    title: str = attr.ib()
    links: ty.List[LinkInfo] = attr.ib(default=attr.Factory(list))

I want to deserialize the JSON into these attrs classes:

js = AnalysisTemplateGetModel(**json.loads(json_data))

However, the links field of js still contains plain dictionaries instead of LinkInfo objects:

str(js)
AnalysisTemplateGetModel(..., links=[{'href': 'string', 'method': {'method': 'string'}, 'rel': 'string'}])

Solution

  • Let me edit your code examples to make them consistent. I'll also use a recent version of attrs (>=21.3.0, which provides the attrs import namespace used below) on Python 3.10.

    Assume the following JSON string as input:

    json_data = """{
        "title": "my_title",
        "links": [
            {
                "href": "some_href",
                "method": "POST",
                "rel": "some_rel"
            },
            {
                "href": "another_href",
                "method": "GET",
                "rel": "another_rel"
            }
        ]
    }"""
    

    You can achieve nested deserialization by adding converters to the fields that are not native JSON data types:

    from collections.abc import Iterable
    import json
    import typing as ty
    import enum
    from attrs import frozen, field, Factory
    
    
    class HttpMethod(enum.Enum):
        GET = 0
        HEAD = 1
        POST = 2
        PUT = 3
        DELETE = 4
        CONNECT = 5
        OPTIONS = 6
        TRACE = 7
        PATCH = 8
    
    
    def ensure_http_method(data: str | HttpMethod) -> HttpMethod:
        if isinstance(data, str):
            data = HttpMethod[data]
        return data
    
    
    @frozen
    class LinkInfo:
        href: str
        rel: str
        method: HttpMethod = field(converter=ensure_http_method)
    
    
    def ensure_list_of_linkinfos(
        iterable: Iterable[ty.Dict[str, ty.Any] | LinkInfo]
    ) -> ty.List[LinkInfo]:
        return [
            link_info if isinstance(link_info, LinkInfo) else LinkInfo(**link_info)
            for link_info in iterable
        ]
    
    
    @frozen
    class AnalysisTemplateGetModel:
        title: str
        links: ty.List[LinkInfo] = field(
            default=Factory(list), converter=ensure_list_of_linkinfos
        )
    
    analysis_template = AnalysisTemplateGetModel(**json.loads(json_data))
    print(analysis_template)
    

    which will give

    AnalysisTemplateGetModel(title='my_title', links=[LinkInfo(href='some_href', rel='some_rel', method=<HttpMethod.POST: 2>), LinkInfo(href='another_href', rel='another_rel', method=<HttpMethod.GET: 0>)])
    

    However, let me point you to the excellent sister project of attrs, cattrs, which solves exactly that problem of (de)serializing nested attrs classes.

    With cattrs you can skip the converters and write:

    
    import json
    import typing as ty
    import enum
    
    from attrs import frozen
    from cattrs import Converter
    
    class HttpMethod(enum.Enum):
        GET = 0
        HEAD = 1
        POST = 2
        PUT = 3
        DELETE = 4
        CONNECT = 5
        OPTIONS = 6
        TRACE = 7
        PATCH = 8
    
    
    @frozen
    class LinkInfo:
        href: str
        rel: str
        method: HttpMethod
    
    @frozen
    class AnalysisTemplateGetModel:
        title: str
        links: ty.List[LinkInfo]
    
    
    converter = Converter()
    # By default cattrs structures enums by value (e.g. 2 -> HttpMethod.POST).
    # The JSON here carries the member name instead, so register a hook
    # that looks the member up by name.
    converter.register_structure_hook(enum.Enum, lambda string, cl: cl[string])
    
    print(converter.structure(json.loads(json_data), AnalysisTemplateGetModel))
    

    This also gives

    AnalysisTemplateGetModel(title='my_title', links=[LinkInfo(href='some_href', rel='some_rel', method=<HttpMethod.POST: 2>), LinkInfo(href='another_href', rel='another_rel', method=<HttpMethod.GET: 0>)])
    

    and is much more concise and less invasive: you don't have to add anything to your classes to make serialization work, which is the beauty of cattrs.