Using python 3.10.4
Hi all, I'm putting together a script where I'm reading a yaml file with k8s cluster info, and I'd like to treat the loaded yaml as dataclasses so I can reference them with dot (.) properties.
Example yaml:
account: 12345
clusters:
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
And here's my script for loading and accessing it:
import yaml
from dataclasses import dataclass

@dataclass
class ClusterInfo:
    _name: str
    _endpoint: str
    _certificate: str

@dataclass
class AWSInfo:
    _account: int
    _clusters: list[ClusterInfo]

clusters = yaml.safe_load(open('D:\git\clusters.yml', 'r'))
a = AWSInfo(
    _account=clusters['account'],
    _clusters=clusters['clusters']
)
print(a._account) #prints 12345
print(a._clusters) #prints the dict of both clusters
print(a._clusters[0]) #prints the dict of the first cluster

#These prints fails with AttributeError: 'dict' object has no attribute '_endpoint'
print(a._clusters[0]._endpoint)
for c in a._clusters:
    print(c._endpoint)
So my question is: What am I doing wrong on the last prints? How can I access the properties of each member in a dataclass array of dataclass objects?
The dataclasses module doesn't provide built-in support for this use case, i.e. loading YAML data to a nested class model.
In such a scenario, I would turn to a ser/de library such as dataclass-wizard, which provides OOTB support for (de)serializing YAML data, via the PyYAML library.
Disclaimer: I am the creator and maintainer of this library.
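As for why the original attempt fails: dataclasses do not validate or convert field values, so the list[ClusterInfo] annotation is only a hint and the clusters field still holds the plain dicts returned by yaml.safe_load. If you would rather stay with the standard library, a minimal sketch is to build the nested objects yourself. The assumptions here are mine, not part of the question: from_dict is a hypothetical helper name, the field names drop the leading underscore so they match the YAML keys, and the raw dict stands in for the loaded YAML so the snippet runs without PyYAML.

```python
from dataclasses import dataclass


@dataclass
class ClusterInfo:
    name: str
    endpoint: str
    certificate: str


@dataclass
class AWSInfo:
    account: int
    clusters: list[ClusterInfo]

    @classmethod
    def from_dict(cls, raw: dict) -> "AWSInfo":
        # Hypothetical helper: dataclasses do not convert nested dicts,
        # so each ClusterInfo is constructed explicitly from its dict.
        return cls(
            account=raw["account"],
            clusters=[ClusterInfo(**c) for c in raw["clusters"]],
        )


# Stand-in for the result of yaml.safe_load() on the example file.
raw = {
    "account": 12345,
    "clusters": [
        {"name": "cluster_1", "endpoint": "https://cluster_2", "certificate": "abcdef"},
        {"name": "cluster_1", "endpoint": "https://cluster_2", "certificate": "abcdef"},
    ],
}

info = AWSInfo.from_dict(raw)
for c in info.clusters:
    print(c.endpoint)
```

This works for a small, fixed model, but every extra level of nesting needs its own conversion code, which is the part a ser/de library takes care of.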
Note: generating a dataclass model from YAML data is currently harder than it should be, and I will likely need to make this step easier. Perhaps it is worth creating an issue to look into as time allows. Ideally this would be done from the CLI, but that is tricky here because the utility tool expects JSON and we have YAML data.
So it is easiest to do this in Python itself, for now:
from json import dumps

# pip install PyYAML dataclass-wizard
from yaml import safe_load
from dataclass_wizard.wizard_cli import PyCodeGenerator

yaml_string = """
account: 12345
clusters:
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
  - name: cluster_1
    endpoint: https://cluster_2
    certificate: abcdef
"""

py_code = PyCodeGenerator(experimental=True, file_contents=dumps(safe_load(yaml_string))).py_code
print(py_code)
Prints:
from __future__ import annotations

from dataclasses import dataclass

from dataclass_wizard import JSONWizard


@dataclass
class Data(JSONWizard):
    """
    Data dataclass

    """
    account: int
    clusters: list[Cluster]


@dataclass
class Cluster:
    """
    Cluster dataclass

    """
    name: str
    endpoint: str
    certificate: str
With the model in hand, subclass YAMLWizard (instead of JSONWizard) to load the YAML data directly.
Contents of my_file.yml:
account: 12345
clusters:
  - name: cluster_1
    endpoint: https://cluster_5
    certificate: abcdef
  - name: cluster_2
    endpoint: https://cluster_7
    certificate: xyz
Python code:
from __future__ import annotations

from dataclasses import dataclass
from pprint import pprint

from dataclass_wizard import YAMLWizard


@dataclass
class Data(YAMLWizard):
    account: int
    clusters: list[Cluster]


@dataclass
class Cluster:
    name: str
    endpoint: str
    certificate: str


data = Data.from_yaml_file('./my_file.yml')
pprint(data)
for c in data.clusters:
    print(c.endpoint)
Result:
Data(account=12345,
     clusters=[Cluster(name='cluster_1',
                       endpoint='https://cluster_5',
                       certificate='abcdef'),
               Cluster(name='cluster_2',
                       endpoint='https://cluster_7',
                       certificate='xyz')])
https://cluster_5
https://cluster_7