Tags: python, yaml, runtime-error, ruamel.yaml

How to use ruamel.yaml.round_trip_load to preserve quotes and not get error for duplicate keys?


I have the following test.yaml document:

components:
  schemas:
    description: 'ex1'
    description: 'ex2'

and the following Python script that reads the YAML:

import ruamel.yaml

ruamel.yaml.YAML().allow_duplicate_keys = True


def read_yaml_file(filename: str):
    with open(filename, 'r') as stream:
        my_yaml = ruamel.yaml.round_trip_load(stream, preserve_quotes=True)
        return my_yaml

my_yaml = read_yaml_file('test.yaml')

Question: How do I get past the following error? (I don't mind if the keys are overwritten.)

ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping
  in "test.yaml", line 3, column 5
found duplicate key "description" with value "ex2" (original value: "ex1")
  in "test.yaml", line 4, column 5

Solution

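  • You're mixing the new-style API with the old (PyYAML-style) API, which you shouldn't do. The YAML() instance on which you set .allow_duplicate_keys is discarded immediately, because nothing keeps a reference to it, so the setting never reaches round_trip_load(). Instead, keep the instance and use its .load() method to load your input (which is strictly speaking not YAML, since the YAML specification requires mapping keys to be unique):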

    import sys
    import ruamel.yaml
    
    yaml_str = """\
    components:
      schemas:
        description: 'ex1'
        description: 'ex2'
    """
    
    yaml = ruamel.yaml.YAML()
    yaml.allow_duplicate_keys = True
    yaml.preserve_quotes = True
    # yaml.indent(mapping=4, sequence=4, offset=2)
    data = yaml.load(yaml_str)
    yaml.dump(data, sys.stdout)
    

    which gives:

    components:
      schemas:
        description: 'ex1'
    

    This means the value for the key description doesn't get overwritten; instead, the first value in the file for that key is preserved.