I am trying to parse a YAML file: https://github.com/open-telemetry/opentelemetry-specification/blob/master/semantic_conventions/resource/cloud.yaml
I am using the following code:
import yaml

with open('cloud.yaml') as f:
    my_dict = yaml.safe_load(f)
print(my_dict)
This produces the following dictionary:
{'groups': [{'id': 'cloud', 'prefix': 'cloud', 'brief': 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n', 'attributes': [{'id': 'provider', 'type': {'allow_custom_values': True, 'members': [{'id': 'AWS', 'value': 'aws', 'brief': 'Amazon Web Services'}, {'id': 'Azure', 'value': 'azure', 'brief': 'Microsoft Azure'}, {'id': 'GCP', 'value': 'gcp', 'brief': 'Google Cloud Platform'}]}, 'brief': 'Name of the cloud provider.\n', 'examples': 'gcp'}, {'id': 'account.id', 'type': 'string', 'brief': 'The cloud account ID used to identify different entities.\n', 'examples': ['opentelemetry']}, {'id': 'region', 'type': 'string', 'brief': 'A specific geographical location where different entities can run.\n', 'examples': ['us-central1']}, {'id': 'zone', 'type': 'string', 'brief': 'Zones are a sub set of the region connected through low-latency links.\n', 'note': 'In AWS, this is called availability-zone.\n', 'examples': ['us-central1-a']}]}]}
I want to iterate through the elements and extract the values listed further down.
I tried going through all the values of the dictionary with the code below:
for groups in my_dict.values():
    print(groups)
The output is:
[{'id': 'cloud', 'prefix': 'cloud', 'brief': 'A cloud infrastructure (e.g. GCP, Azure, AWS)\n', 'attributes': [{'id': 'provider', 'type': {'allow_custom_values': True, 'members': [{'id': 'AWS', 'value': 'aws', 'brief': 'Amazon Web Services'}, {'id': 'Azure', 'value': 'azure', 'brief': 'Microsoft Azure'}, {'id': 'GCP', 'value': 'gcp', 'brief': 'Google Cloud Platform'}]}, 'brief': 'Name of the cloud provider.\n', 'examples': 'gcp'}, {'id': 'account.id', 'type': 'string', 'brief': 'The cloud account ID used to identify different entities.\n', 'examples': ['opentelemetry']}, {'id': 'region', 'type': 'string', 'brief': 'A specific geographical location where different entities can run.\n', 'examples': ['us-central1']}, {'id': 'zone', 'type': 'string', 'brief': 'Zones are a sub set of the region connected through low-latency links.\n', 'note': 'In AWS, this is called availability-zone.\n', 'examples': ['us-central1-a']}]}]
I want to print each value individually, for example: cloud, A cloud infrastructure (e.g. GCP, Azure, AWS)\n, etc.
The output I need is:
cloud, A cloud infrastructure (e.g. GCP, Azure, AWS).
cloud.provider,, Name of the cloud provider.
cloud.provider.member, AWS, Amazon Web Services
cloud.provider.member, azure, Microsoft Azure
cloud.provider.member, GCP, Google Cloud Platform
cloud.account.id, string, The cloud account ID used to identify different entities.
cloud.region, string, A specific geographical location where different entities can run.
.
.
.
.
This can also be implemented in a generic way by checking whether the value of 'type' is a dict instance.
Assuming the variable parsed_dict holds the result of parsing the YAML file:
def remove_end_of_line_char(line_text):
    # Drop a single trailing newline, if present.
    if len(line_text) > 0 and line_text[-1] == '\n':
        line_text = line_text[:-1]
    return line_text

data_groups = parsed_dict["groups"]
for group in data_groups:
    msg = remove_end_of_line_char(f"{group['id']}, {group['brief']}")
    print(msg)
    attributes_list = group["attributes"]
    for attribute in attributes_list:
        attr_type = attribute['type']
        if isinstance(attr_type, dict):
            # Enum-like type: print the attribute, then one line per member.
            print(f"{group['id']}.{attribute['id']},, {remove_end_of_line_char(attribute['brief'])}")
            cloud_provider_member_prefix = f"{group['id']}.{attribute['id']}.member, "
            for member in attr_type['members']:
                print(f"{cloud_provider_member_prefix}{member['id']}, {member['brief']}")
        else:
            # Plain type such as 'string': print id, type and brief on one line.
            msg = remove_end_of_line_char(f"{group['id']}.{attribute['id']}, {attribute['type']}, {attribute['brief']}")
            print(msg)
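To try the approach without the YAML file at hand, here is a minimal self-contained sketch of the same logic written as a generator. The names iter_rows, strip_trailing_newline and sample are made up for this illustration, and sample is only an abbreviated, hand-copied subset of the dictionary shown in the question, not the full file.

```python
def strip_trailing_newline(text):
    """Drop a single trailing newline, if present."""
    return text[:-1] if text.endswith("\n") else text


def iter_rows(parsed):
    """Yield one output line per group, attribute and enum member."""
    for group in parsed["groups"]:
        yield f"{group['id']}, {strip_trailing_newline(group['brief'])}"
        for attribute in group["attributes"]:
            name = f"{group['id']}.{attribute['id']}"
            brief = strip_trailing_newline(attribute["brief"])
            attr_type = attribute["type"]
            if isinstance(attr_type, dict):
                # Enum-like type: the type column stays empty, members follow.
                yield f"{name},, {brief}"
                for member in attr_type["members"]:
                    yield f"{name}.member, {member['id']}, {member['brief']}"
            else:
                # Plain type such as 'string'.
                yield f"{name}, {attr_type}, {brief}"


# Hypothetical sample: an abbreviated copy of the parsed dictionary.
sample = {
    "groups": [
        {
            "id": "cloud",
            "prefix": "cloud",
            "brief": "A cloud infrastructure (e.g. GCP, Azure, AWS)\n",
            "attributes": [
                {
                    "id": "provider",
                    "type": {
                        "allow_custom_values": True,
                        "members": [
                            {"id": "AWS", "value": "aws", "brief": "Amazon Web Services"},
                            {"id": "Azure", "value": "azure", "brief": "Microsoft Azure"},
                        ],
                    },
                    "brief": "Name of the cloud provider.\n",
                    "examples": "gcp",
                },
                {
                    "id": "region",
                    "type": "string",
                    "brief": "A specific geographical location where different entities can run.\n",
                    "examples": ["us-central1"],
                },
            ],
        }
    ]
}

for row in iter_rows(sample):
    print(row)
```

Yielding the lines instead of printing them inside the loops keeps the traversal separate from the output, so the same rows could be written to a file or collected into a list.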