
Converting YAML structured as key/value lists to JSON


I have the YAML file below, and I need to keep its structure. Its peculiarity is that some parents hold nested key/value pairs (personalInfo and more), while others hold nested keys without values (extra).
The number of key/value pairs is arbitrary, and so is the number of parents (in this example the parents are personalInfo, more, and extra).

name: person1
personalInfo:
  - key: address
    value: street 1
  - key: age
    value: 10
  - key: school
    value: XYZ
more:
  - key: mother
    value: Samantha
extra:
 - key: a
 - key: b

From this YAML I want to generate JSON formatted as shown below, but I don't know how to achieve this.

'{"personalInfo" : {"address": "street 1", "age": "10", "school": "XYZ"}, "more":{"mother": "Samantha"}, "extra": ["a", "b"]}' "localhost:8080/person1"

Solution

  • The easiest method is PyYAML, which can be installed via pip install pyyaml. A plain yaml.load() function exists, but yaml.safe_load() should always be preferred unless you explicitly need arbitrary object serialization/deserialization, because loading untrusted input with an unsafe loader opens the door to arbitrary code execution.

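    import yaml

    # safe_load only builds plain Python types (dict, list, str, int, ...)
    with open("test.yml", 'r') as stream:
        try:
            data = yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            print(exc)
            raise  # data would be undefined below, so stop here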
    

    This returns:

    {'name': 'person1',
     'personalInfo': [{'key': 'address', 'value': 'street 1'},
      {'key': 'age', 'value': 10},
      {'key': 'school', 'value': 'XYZ'}],
     'more': [{'key': 'mother', 'value': 'Samantha'}],
     'extra': [{'key': 'a'}, {'key': 'b'}]}
    

    And you can try this to get your desired result:

    import json
    
    new_data = {}
    
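    for parent, items in data.items():
        if parent == 'name':
            continue  # the name is only used later to build the URL
        pairs = {}  # entries that have both a key and a value
        bare = []   # entries that have only a key
        for item in items:
            if 'key' in item:
                if 'value' in item:
                    pairs[item['key']] = item['value']
                else:
                    bare.append(item['key'])
        if bare:
            new_data[parent] = bare
        elif pairs:
            new_data[parent] = pairs
        else:
            new_data[parent] = items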
    
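    # json.dumps already produces double-quoted JSON, so serialize only once
    payload = json.dumps(new_data)
    url = "localhost:8080/" + data['name']
    
    # Write the payload in single quotes followed by the quoted URL,
    # which is the format requested in the question
    with open("sample.json", "w") as outfile:
        outfile.write("'" + payload + "' \"" + url + "\"\n")
    

    Result in sample.json:

    '{"personalInfo": {"address": "street 1", "age": 10, "school": "XYZ"}, "more": {"mother": "Samantha"}, "extra": ["a", "b"]}' "localhost:8080/person1"

    Note that age stays the integer 10 because YAML parses it as a number. If you need "10" as in the question, convert the values with str() when filling the dictionary.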