Tags: python, yaml, pyyaml

Access nested elements in YAML and output ordered unique python list


I am trying to parse a YAML file and collect its nested child elements into a sorted Python list that contains no duplicate values. My input YAML file is:

# example.yml

name_1:
  parameters:
    - soccer
    - football
    - basketball
    - cricket
    - hockey
    - table tennis
  tag:
    - navigation
  assets:
    - url

name_2:
  parameters:
  - soccer
  - rugby
  - swimming
  examples:
  - use case 1
  - use case 2
  - use case 3

I managed to print the keys found directly under each top-level entry, which are:

['assets', 'examples', 'parameters', 'tag']

with the following code:

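import yaml

with open(r'/Users/.../example.yml') as file:
    documents = yaml.full_load(file)

a_list = []
for item, doc in documents.items():
    # Extending a list with a dict adds only the dict's keys.
    a_list.extend(doc)
# set() drops duplicates, sorted() returns them in alphabetical order.
res = sorted(set(a_list))
print(res)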

I am struggling to extend the script so that it produces the following sorted, unique list of the values under the parameters key:

['basketball', 'cricket', 'football', 'hockey', 'rugby', 'soccer', 'swimming', 'table tennis']

Thanks in advance for any suggestions!


Solution

  • I was able to get this by iterating over the values under the parameters key of each entry:

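    import yaml

    with open(r'example.yml') as file:
        documents = yaml.full_load(file)

    a_list = []
    a_vals = []
    for item, doc in documents.items():
        # .get() avoids a KeyError for entries that have no 'parameters' key.
        a_vals.extend(doc.get('parameters', []))
        # Extending a list with a dict adds only the dict's keys.
        a_list.extend(doc)
    # set() drops duplicates, sorted() returns them in alphabetical order.
    res = sorted(set(a_list))
    a_vals = sorted(set(a_vals))
    print(a_vals)
    print(res)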

    Output:

    python.exe "pysuperclass.py"
    ['basketball', 'cricket', 'football', 'hockey', 'rugby', 'soccer', 'swimming', 'table tennis']
    ['assets', 'examples', 'parameters', 'tag']
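
The same idea can be wrapped in a small helper that works for any nested key, not only parameters. The following is a minimal sketch, not the answerer's code: the function names unique_sorted_values and unique_in_first_seen_order are hypothetical, and the documents dict is written out by hand to mirror what yaml.full_load would return for the example file, so the snippet runs without reading a file.

```python
def unique_sorted_values(documents, key):
    """Return the sorted, de-duplicated values listed under `key` in each entry."""
    values = set()
    for entry in documents.values():
        # Entries that are not mappings, or that lack `key`, contribute nothing.
        if isinstance(entry, dict):
            values.update(entry.get(key) or [])
    return sorted(values)


def unique_in_first_seen_order(documents, key):
    """Same as above, but keep the order in which values first appear."""
    seen = {}  # dicts preserve insertion order (Python 3.7+)
    for entry in documents.values():
        if isinstance(entry, dict):
            for value in entry.get(key) or []:
                seen.setdefault(value, None)
    return list(seen)


# Hand-written stand-in for the parsed example file (assumption: this matches
# the structure produced by yaml.full_load for the YAML shown in the question).
documents = {
    "name_1": {
        "parameters": ["soccer", "football", "basketball",
                       "cricket", "hockey", "table tennis"],
        "tag": ["navigation"],
        "assets": ["url"],
    },
    "name_2": {
        "parameters": ["soccer", "rugby", "swimming"],
        "examples": ["use case 1", "use case 2", "use case 3"],
    },
}

print(unique_sorted_values(documents, "parameters"))
print(unique_sorted_values(documents, "tag"))
print(unique_in_first_seen_order(documents, "parameters"))
```

With a real file, documents would come from yaml.full_load(file) as in the answer above. The first-seen variant may be useful if "ordered" is meant as file order rather than alphabetical order.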