PyYAML: modify aws-auth-cm.yaml and preserve the multi-line string


I am trying to load a YAML file in Python, modify it, and dump it back. The YAML looks like this:

data:
  mapRoles: |
    - username: system:node:{{EC2PrivateDNSName}}
      groups:
      - system:bootstrappers
      - system:nodes
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system

I would like to modify it so that the output file includes a new line, rolearn: awsarn, inside mapRoles:

data:
  mapRoles: |
    - username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
      rolearn: awsarn
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system

But the output I get has the mapRoles value quoted as an ordinary string, with literal \n escapes in it:

apiVersion: v1
data:
  mapRoles: "- username: system:node:{{EC2PrivateDNSName}}\n  groups:\n    - system:bootstrappers\n\
    \    - system:nodes\n  rolearn: arnaws"
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system

The code I am using:

import yaml

with open('/tmp/aws-auth-cm.yaml') as f:
    content = yaml.safe_load(f)
    # Append the new key to the string loaded from the literal scalar
    content['data']['mapRoles'] = content['data']['mapRoles'] + '  rolearn: awsarn'
with open('/tmp/aws-auth-cm.yaml', 'w') as f:
    yaml.safe_dump(content, f, default_flow_style=False)

I also tried yaml.safe_dump(content, f, default_flow_style=False, default_style='|'), but then every value is dumped with |- and the keys are wrapped in double quotes:

"apiVersion": |-
  v1
"data":
  "mapRoles": |-
    - username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
      rolearn: arnaws
"kind": |-
  ConfigMap
"metadata":
  "name": |

Is there a way to apply style='|' only to the strings and also make sure the keys are not quoted?


Solution

  • You can do this with PyYAML, but you need to load the literal block scalar (that is what a multi-line construct introduced by | is called) into a subclass of str, make sure you can still modify it, and then, on dumping, use a special representer for that subclass so that it is written as a literal scalar again.

    The easier way to accomplish this is to switch from PyYAML to ruamel.yaml (disclaimer: I am the author of that package). It not only preserves the literal block, it also supports the more up-to-date YAML 1.2 specification (issued in 2009) and preserves comments and tags, integer and float formats, and (optionally) superfluous quotes:

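    import ruamel.yaml

    # The default YAML() instance is the round-trip loader/dumper,
    # which keeps the literal block style of mapRoles.
    yaml = ruamel.yaml.YAML()
    with open('aws-auth-cm.yaml') as f:
        content = yaml.load(f)
    # Append the new key, ending with a newline
    content['data']['mapRoles'] += '  rolearn: awsarn\n'
    with open('aws-auth-cm.yaml', 'w') as f:
        yaml.dump(content, f)
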

    which gives:

    data:
      mapRoles: |
        - username: system:node:{{EC2PrivateDNSName}}
          groups:
          - system:bootstrappers
          - system:nodes
          rolearn: awsarn
    kind: ConfigMap
    metadata:
      name: aws-auth
      namespace: kube-system
    

    Note that I saved some typing by using += to change the "string" loaded from the literal scalar, and that I added a newline to the end of the appended string. Without that newline, your literal scalar would be introduced with |-, where the - is the block chomping indicator that requests stripping of the final line break.
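
The PyYAML route described at the start of this answer can be sketched roughly as follows. This is only a minimal illustration under a few assumptions: the names LiteralStr and represent_literal_str are made up for the example, the YAML is inlined as a string instead of being read from /tmp/aws-auth-cm.yaml, and the value is wrapped in the subclass after loading rather than by a custom constructor. The second dump shows the effect of leaving off the trailing newline.

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type: values of this class are dumped as literal block scalars."""


def represent_literal_str(dumper, data):
    # Same tag as an ordinary string, but force the '|' block style.
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style="|")


# Register on SafeDumper so that yaml.safe_dump() uses the representer.
yaml.SafeDumper.add_representer(LiteralStr, represent_literal_str)

SOURCE = """\
data:
  mapRoles: |
    - username: system:node:{{EC2PrivateDNSName}}
      groups:
      - system:bootstrappers
      - system:nodes
kind: ConfigMap
"""

content = yaml.safe_load(SOURCE)
roles = content["data"]["mapRoles"]  # a plain str; the '|' style is not kept on load

# With a trailing newline the scalar is written with a plain '|' header.
content["data"]["mapRoles"] = LiteralStr(roles + "  rolearn: awsarn\n")
with_newline = yaml.safe_dump(content, default_flow_style=False)

# Without it, the emitter adds the strip indicator and writes '|-'.
content["data"]["mapRoles"] = LiteralStr(roles + "  rolearn: awsarn")
without_newline = yaml.safe_dump(content, default_flow_style=False)

print(with_newline)
```

Only values wrapped in LiteralStr get the block style, so keys and the other scalars are dumped as usual. Unlike ruamel.yaml, this approach does not preserve comments or the original key order.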