python yaml pyyaml ruamel.yaml

Edit an existing YAML file while keeping the original comments


I am trying to write a Python script that converts our iptables config to firewall multi within a YAML file. I originally used PyYAML, but later found that it removes all comments, which I need to keep. I found that ruamel.yaml can preserve comments, but I am struggling to get this to work.

import sys
import io
import string
from collections import defaultdict
import ruamel.yaml 


#loading the yaml file 

try:
      config = ruamel.yaml.round_trip_load(open('test.yaml'))
except ruamel.yaml.YAMLError as exc:
      print(exc)

print (config)

# Output class
#this = defaultdict(list)
this = {}
rule_number = 200
iptables_key_name = "ha247::firewall_addrule::firewall_multi"


# Do stuff here
for key, value in config.items():
 # Manipulate iptables rules only
   if key == 'iptables::rules':

# Set dict within iptables_key_name
     this[iptables_key_name] = {}
     for rule, rule_value in value.items():

# prefix rule with ID
         new_rule =("%s %s" % (rule_number,rule))
         rule_number = rule_number + 1



# Set dic within [iptables_key_name][rule]
         this[iptables_key_name][new_rule] = {}
# Ensure we have action
         this[iptables_key_name][new_rule]['action'] = 'accept'
         for b_key, b_value in rule_value.items():
# Change target to action as rule identifier
             b_key = b_key.replace('target','action')
# Save each rule and ensure we are lowercase
             this[iptables_key_name][new_rule][b_key] = str(b_value).lower()

  elif key == 'ha247::security::enable': 
      this['ha247::security_firewall::enable'] = value

  elif key == 'iptables::safe_ssh':
      this['ha247::security_firewall::safe_ssh'] = value

  else:
# Print to yaml
     this[key] = value


# Write YAML file
  with io.open('result.yaml', 'w', encoding='utf8') as outfile:
       ruamel.yaml.round_trip_dump(this, outfile, default_flow_style=False, allow_unicode=True)

The input file (test.yaml)

---

# Enable default set of security rules


# Configure firewall
iptables::rules:
 ACCEPT_HTTP:
    port: '80'
 HTTPS:
    port: '443'

# Configure the website
simple_nginx::vhosts:
    <doamin>:
     backend: php-fpm
     template: php-magento-template
     server_name: 
     server_alias: www.
     document_root: /var/www/
     ssl_enabled: true
     ssl_managed_enabled: true
     ssl_managed_name: www.
     force_www: true

The output of result.yaml

ha247::firewall_addrule::firewall_multi:
  200 ACCEPT_HTTP:
    action: accept
    port: '80'
  201 HTTPS:
    action: accept
    port: '443'

ha247::security_firewall::enable: true
ha247::security_firewall::safe_ssh: false
simple_nginx::ssl_ciphers:     
simple_nginx::vhosts:
 <domain>:
    backend: php-fpm
    document_root: /var/www/
    force_www: true
    server_alias: www.
    server_name: .com
    ssl_enabled: true
    ssl_managed_enabled: true
    ssl_managed_name: www.
    template: php-magento-template

This is where the problem lies: as you can see, it has changed all the formatting and deleted the comments, which we need to keep. Another issue is that it has removed the three hyphens at the top, which will make our configuration manager unable to read the file.


Solution

  • You cannot get exactly what you want, because you indent mappings inconsistently: the indents for your mappings are 1, 2, 3, and 4 positions. As documented, ruamel.yaml has only one setting that is applied to all mappings (it defaults to 2).

    Currently document start (and end) markers are not analysed on input, so you'll have to do some minimal extra work.
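
    One way to do that extra work is to look for a leading --- in the raw text yourself and feed the result to explicit_start. The sketch below assumes only the first document's marker matters; the helper name has_document_start is made up for this illustration and is not part of ruamel.yaml.

```python
def has_document_start(text):
    """Return True if the first significant line of `text` is a '---' marker.

    Blank lines, comment-only lines and directives (lines starting with '%')
    that precede the marker are skipped.
    (Hypothetical helper for illustration, not a ruamel.yaml function.)
    """
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith('#') or line.startswith('%'):
            continue
        # '---' on its own, or '---' followed by a space and more content
        return line.rstrip() == '---' or line.startswith('--- ')
    return False


with_marker = "---\n\n# Configure firewall\niptables::rules:\n"
without_marker = "# Configure firewall\niptables::rules:\n"

print(has_document_start(with_marker))     # True
print(has_document_start(without_marker))  # False

# With ruamel.yaml you could then set, before dumping:
#     yaml.explicit_start = has_document_start(raw_text)
```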

    The biggest problem, however, is your misconception of what it means to use the round-trip loader and dumper. It is meant to load a YAML document into a Python data structure, change that data structure, and then write out that same data structure. You create a new data structure out of the blue (this), assign some values from the data structure loaded from YAML (config), and then write out that new data structure (this). From your call to print(), you can see that you are loading a CommentedMap as the root data structure, and your normal Python dict of course doesn't know about any comments that were loaded and attached to config.

    So first look at what you would get with a minimal program that loads and dumps your input file without (explicitly) changing anything. I will be using the new API, and recommend you do so too, although you can probably get this done with the old API as well. In the new API, allow_unicode defaults to True.

    import sys
    from ruamel.yaml import YAML
    
    yaml = YAML()
    yaml.explicit_start = True
    yaml.indent(mapping=3)
    yaml.preserve_quotes = True  # not necessary for your current input
    
    with open('test.yaml') as fp:
        data = yaml.load(fp)
    yaml.dump(data, sys.stdout)
    

    Which gives:

    ---
    
    # Enable default set of security rules
    
    
    # Configure firewall
    iptables::rules:
       ACCEPT_HTTP:
          port: '80'
       HTTPS:
          port: '443'
    
    # Configure the website
    simple_nginx::vhosts:
       <doamin>:
          backend: php-fpm
          template: php-magento-template
          server_name:
          server_alias: www.
          document_root: /var/www/
          ssl_enabled: true
          ssl_managed_enabled: true
          ssl_managed_name: www.
          force_www: true
    

    And that only differs from your input test.yaml in having consistent indentation (i.e. diff -b gives no differences).
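
    If you would rather run that check from Python than with diff, a rough approximation of diff -b is to collapse each run of whitespace to a single space and drop trailing whitespace before comparing lines. The function name below is made up, and this is a sketch of the idea rather than a faithful reimplementation of diff.

```python
import re


def equal_ignoring_space_amount(text_a, text_b):
    """Compare two texts line by line, roughly like `diff -b`.

    Runs of whitespace count as one space and trailing whitespace is
    ignored. (Hypothetical helper; not a full reimplementation of diff.)
    """
    def normalise(text):
        return [re.sub(r'\s+', ' ', line.rstrip()) for line in text.splitlines()]

    return normalise(text_a) == normalise(text_b)


original = "iptables::rules:\n ACCEPT_HTTP:\n    port: '80'\n"
reindented = "iptables::rules:\n   ACCEPT_HTTP:\n      port: '80'\n"
changed = "iptables::rules:\n   ACCEPT_HTTP:\n      port: '8080'\n"

print(equal_ignoring_space_amount(original, reindented))  # True
print(equal_ignoring_space_amount(original, changed))     # False
```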


    Your code doesn't actually work (syntax error because of indentation) and if it did, it is not clear where the

    ha247::security_firewall::enable: true
    ha247::security_firewall::safe_ssh: false
    simple_nginx::ssl_ciphers:   
    

    in the output come from, nor how <doamin> gets changed into <domain> (you are doing something fishy there for real, as otherwise the keys in the value for <domain> would not magically get sorted).

    Assuming as input test.yaml:

    ---
    
    # Enable default set of security rules
    
    
    # Configure firewall
    iptables::rules:
     ACCEPT_HTTP:
        port: '80'
     HTTPS:
        port: '443'
    
    ha247::security::enable: true         # EOL Comment
    iptables::safe_ssh: false
    simple_nginx::ssl_ciphers:
    # Configure the website
    simple_nginx::vhosts:
        <doamin>:
         backend: php-fpm
         template: php-magento-template
         server_name:
         server_alias: www.
         document_root: /var/www/
         ssl_enabled: true
         ssl_managed_enabled: true
         ssl_managed_name: www.
         force_www: true
    

    and the following program:

    import sys
    from ruamel.yaml import YAML
    
    yaml = YAML()
    yaml.explicit_start = True
    yaml.indent(mapping=3)
    yaml.preserve_quotes = True  # not necessary for your current input
    
    with open('test.yaml') as fp:
        data = yaml.load(fp)
    
    
    key_map = {
        'iptables::rules': ['ha247::firewall_addrule::firewall_multi', None, 200],
        'ha247::security::enable': ['ha247::security_firewall::enable', None],
        'iptables::safe_ssh': ['ha247::security_firewall::safe_ssh', None],
    }
    
    for idx, key in enumerate(data):
        if key in key_map:
            key_map[key][1] = idx
    
    rule_number = 200
    
    for key in key_map:
        km_val = key_map[key]
        if km_val[1] is None:  # this is the index in data, if found
            continue
        # pop the value and reinsert it in the right place with the new name
        value = data.pop(key)
        data.insert(km_val[1], km_val[0], value)
        # and move the key related comments
        data.ca._items[km_val[0]] = data.ca._items.pop(key, None)
        if key == 'iptables::rules':
            data[km_val[0]] = xd = {}  # normal dict, so no comments are preserved here
            for rule, rule_value in value.items():
                new_rule = "{} {}".format(rule_number, rule)
                rule_number += 1
                xd[new_rule] = nr = {}
                nr['action'] = 'accept'
                for b_key, b_value in rule_value.items():
                    b_key = b_key.replace('target', 'action')
                    nr[b_key] = b_value.lower() if isinstance(b_value, str) else b_value
    
    
    yaml.dump(data, sys.stdout)
    

    you get:

    ---
    
    # Enable default set of security rules
    
    
    # Configure firewall
    ha247::firewall_addrule::firewall_multi:
       200 ACCEPT_HTTP:
          action: accept
          port: '80'
       201 HTTPS:
          action: accept
          port: '443'
    
    ha247::security_firewall::enable: true # EOL Comment
    ha247::security_firewall::safe_ssh: false
    simple_nginx::ssl_ciphers:
    # Configure the website
    simple_nginx::vhosts:
       <doamin>:
          backend: php-fpm
          template: php-magento-template
          server_name:
          server_alias: www.
          document_root: /var/www/
          ssl_enabled: true
          ssl_managed_enabled: true
          ssl_managed_name: www.
          force_www: true
    

    Which should be a good basis to start from.

    Please note that I used .format() instead of the old-fashioned % formatting. I also only lowercase b_value if it is a string; your code would, for example, convert an integer to a string, and that would lead to quotes in your output where there were none to start with.
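
    A quick way to see that last difference, using made-up rule values and a hypothetical helper name (lower_if_str):

```python
def lower_if_str(value):
    """Lowercase strings; leave every other type untouched."""
    return value.lower() if isinstance(value, str) else value


rule = {'target': 'ACCEPT', 'port': 80, 'log': True}

# Converting everything with str() turns the int and the bool into strings
always_str = {k: str(v).lower() for k, v in rule.items()}
# Checking the type first keeps non-string values as they were loaded
only_str = {k: lower_if_str(v) for k, v in rule.items()}

print(always_str)  # {'target': 'accept', 'port': '80', 'log': 'true'}
print(only_str)    # {'target': 'accept', 'port': 80, 'log': True}
```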