I'm working with a JSON data structure and am trying to represent it as a dataclass. The data structure is (partly) circular and I want the nested data structures to be neatly represented as dataclasses as well.
I am having some trouble getting the dataclasses to parse correctly. See the simplified example below:
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Union

from dataclasses_json import dataclass_json


class SchemaTypeName(Enum):
    LONG = "long"
    NULL = "null"
    RECORD = "record"
    STRING = "string"


@dataclass_json
@dataclass
class SchemaType():
    # A type is a primitive name, a nested schema, or a list mixing both.
    type: Union[
        SchemaTypeName,
        'SchemaType',
        List[
            Union[
                SchemaTypeName,
                'SchemaType'
            ]
        ]
    ]
    fields: Optional[List['SchemaType']] = None
    name: Optional[str] = None
Below is a printout of the object returned after calling from_dict with some sample data. Notice that the nested object (indicated with the arrow) is not parsed as a dataclass correctly.
SchemaType(
    type=[
        'null',
------> {
            'fields': [
                {'name': 'id', 'type': 'string'},
                {'name': 'date', 'type': ['null', 'long']},
                {'name': 'name', 'type': ['null', 'string']}
            ],
            'type': 'record'
        }
    ]
)
Am I declaring the type hint for the type field incorrectly?
I'm using Python 3.9 with dataclasses_json==0.5.2 and marshmallow==3.11.1.
I found that the problem was that dataclasses_json does not decode my elements correctly when they are in a list. With mixed types in a list, the decoder returns a list of plain strings and dicts instead of transforming them into instances of SchemaTypeName and SchemaType.
However, dataclasses_json allows you to configure a custom decoder function for any particular field. This is done by importing the config function from dataclasses_json and passing the result of calling it as the metadata keyword argument of field. The decoder function itself is passed as the decoder keyword argument of config.
Please see the updated example below. Using the schemaTypeDecoder function, I am able to transform my data to the correct types.
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union

from dataclasses_json import config, dataclass_json


class SchemaTypeName(Enum):
    ARRAY = "array"
    LONG = "long"
    NULL = "null"
    OBJECT = "object"
    RECORD = "record"
    STRING = "string"


def schemaTypeDecoder(data: Union[str, dict, List[Union[str, dict]]]):
    def transform(schemaType: Union[str, dict]):
        # A string names a primitive type; a dict is a nested schema.
        if isinstance(schemaType, str):
            return SchemaTypeName(schemaType)
        else:
            return SchemaType.from_dict(schemaType)

    # A list may mix both kinds, so decode it element by element.
    if isinstance(data, list):
        return [transform(schemaType) for schemaType in data]
    else:
        return transform(data)


@dataclass_json()
@dataclass
class SchemaType():
    type: Union[
        SchemaTypeName,
        'SchemaType',
        List[
            Union[
                SchemaTypeName,
                'SchemaType'
            ]
        ]
    ] = field(
        metadata=config(
            decoder=schemaTypeDecoder
        )
    )
    fields: Optional[List['SchemaType']] = None
    name: Optional[str] = None
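
If you want to check the dispatch logic on its own, without dataclasses_json installed, here is a rough standard-library sketch of the same idea. It is not how dataclasses_json works internally: the from_dict classmethod is a hand-written stand-in for the one the library generates, decode_type is a hypothetical name that mirrors schemaTypeDecoder, and the sample dict is reconstructed from the printout in the question rather than taken from real data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List, Optional


class SchemaTypeName(Enum):
    LONG = "long"
    NULL = "null"
    RECORD = "record"
    STRING = "string"


@dataclass
class SchemaType:
    type: Any  # SchemaTypeName, SchemaType, or a list mixing both
    fields: Optional[List["SchemaType"]] = None
    name: Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict) -> "SchemaType":
        # Hand-written stand-in (assumption) for the library-generated from_dict.
        raw_fields = data.get("fields")
        return cls(
            type=decode_type(data["type"]),
            fields=None if raw_fields is None else [cls.from_dict(f) for f in raw_fields],
            name=data.get("name"),
        )


def decode_type(data):
    """Same dispatch as schemaTypeDecoder: str -> enum, dict -> dataclass, list -> per element."""
    def transform(item):
        if isinstance(item, str):
            return SchemaTypeName(item)
        return SchemaType.from_dict(item)

    if isinstance(data, list):
        return [transform(item) for item in data]
    return transform(data)


# Sample input reconstructed from the printout in the question (assumption).
sample = {
    "type": [
        "null",
        {
            "type": "record",
            "fields": [
                {"name": "id", "type": "string"},
                {"name": "date", "type": ["null", "long"]},
                {"name": "name", "type": ["null", "string"]},
            ],
        },
    ]
}

parsed = SchemaType.from_dict(sample)
print(isinstance(parsed.type[1], SchemaType))  # True
```

The nested record that was left as a plain dict in the question is now a SchemaType instance, and the strings inside the mixed list become SchemaTypeName members.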