Tool to automatically expand YAML merges?


I'm looking for a tool or process that can take a YAML file containing anchors, aliases and merge keys and expand the aliases and merges into a flat YAML file. Many commonly used YAML parsers still don't fully support merging.

I'd like to be able to take advantage of merging to keep things DRY, but there are instances where this needs to then be built into a more verbose "flat" YAML file so that it can be used by other tooling which relies on incomplete YAML parsers.

Example Source YAML:

default: &DEFAULT
  URL: website.com
  mode: production  
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600

development:
  <<: *DEFAULT
  URL: website.local
  mode: dev

test:
  <<: *DEFAULT
  URL: test.website.qa
  mode: test

Desired output YAML:

default:
  URL: website.com
  mode: production  
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600

development:
  URL: website.local
  mode: dev
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600

test:
  URL: test.website.qa
  mode: test
  site_name: Website
  some_setting: h2i8yiuhef
  some_other_setting: 3600

Solution

  • If you have Python installed on your system, you can run pip install ruamel.yaml.cmd¹ and then:

    yaml merge-expand input.yaml output.yaml
    

    (Replace output.yaml with - to write to stdout.) This expands the merges while preserving key order and comments.

    The above is actually a few lines of code that use ruamel.yaml¹. If you have Python (2.7 or 3.4+), install the package with pip install ruamel.yaml and save the following as expand.py:

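    import sys
    from ruamel.yaml import YAML
    
    # the safe loader resolves anchors, aliases and merge keys while loading
    yaml = YAML(typ='safe')
    yaml.default_flow_style = False  # write block style instead of inline {...}
    with open(sys.argv[1]) as fp:
        data = yaml.load(fp)
    with open(sys.argv[2], 'w') as fp:
        yaml.dump(data, fp)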
    

    you can already do:

    python expand.py input.yaml output.yaml
    

    That will get you YAML that is semantically equivalent to what you requested (in your desired output the keys of the mappings are not sorted; in this program's output they are).

    The above assumes you don't have any tags in your YAML and don't care about preserving comments. Most of those, and the key ordering, can be preserved by using a patched version of the standard YAML() instance. Patching is necessary because the standard YAML() instance also preserves the merges on round-trip, which is exactly what you don't want:

    import sys
    from ruamel.yaml import YAML, SafeConstructor
    
    yaml = YAML()  # default round-trip mode keeps comments and key order
    
    # use the safe constructor's merge handling, so merges are expanded on load
    yaml.Constructor.flatten_mapping = SafeConstructor.flatten_mapping
    yaml.default_flow_style = False
    yaml.allow_duplicate_keys = True
    # comment out next line if you want "normal" anchors/aliases in your output
    yaml.representer.ignore_aliases = lambda x: True
    
    with open(sys.argv[1]) as fp:
        data = yaml.load(fp)
    with open(sys.argv[2], 'w') as fp:
        yaml.dump(data, fp)
    

    with this input:

    default: &DEFAULT
      URL: website.com
      mode: production
      site_name: Website
      some_setting: h2i8yiuhef
      some_other_setting: 3600  # an hour?
    
    development:
      <<: *DEFAULT
      URL: website.local     # local web
      mode: dev
    
    test:
      <<: *DEFAULT
      URL: test.website.qa
      mode: test
    

    that will give this output (note that comments on the merged-in keys get duplicated):

    default:
      URL: website.com
      mode: production
      site_name: Website
      some_setting: h2i8yiuhef
      some_other_setting: 3600  # an hour?
    
    development:
      URL: website.local     # local web
      mode: dev
    
      site_name: Website
      some_setting: h2i8yiuhef
      some_other_setting: 3600  # an hour?
    
    test:
      URL: test.website.qa
      mode: test
      site_name: Website
      some_setting: h2i8yiuhef
      some_other_setting: 3600  # an hour?
    

    The above is what the yaml merge-expand command, mentioned at the start of this answer, does.
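
    If the data has already been loaded by a parser that lacks merge support, the merge key may show up as a literal "<<" entry in the mapping. To illustrate what expansion does to such data, here is a sketch in plain Python. expand_merges is a hypothetical helper, not part of ruamel.yaml, and it assumes the usual merge key semantics: explicit keys win over merged ones, and earlier mappings in a merge list win over later ones.

```python
def expand_merges(node):
    """Return a copy of node with every '<<' merge key expanded."""
    if isinstance(node, list):
        return [expand_merges(item) for item in node]
    if not isinstance(node, dict):
        return node  # scalars are returned unchanged

    # Collect the merged-in keys; '<<' may hold one mapping or a list of them
    merged = {}
    if "<<" in node:
        sources = node["<<"]
        if isinstance(sources, dict):
            sources = [sources]
        for source in sources:
            for key, value in expand_merges(source).items():
                merged.setdefault(key, value)  # earlier sources take precedence

    # Explicit keys come first and override anything that was merged in
    result = {}
    for key, value in node.items():
        if key != "<<":
            result[key] = expand_merges(value)
    for key, value in merged.items():
        result.setdefault(key, value)
    return result


default = {
    "URL": "website.com",
    "mode": "production",
    "site_name": "Website",
    "some_setting": "h2i8yiuhef",
    "some_other_setting": 3600,
}
data = {
    "default": default,
    "development": {"<<": default, "URL": "website.local", "mode": "dev"},
}
flat = expand_merges(data)
print(flat["development"])
```

    Because explicit keys are written first and the merged-in keys are appended afterwards, the key order of the result matches the desired output in the question.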


    ¹ Disclaimer: I am the author of that package.