How to access & modify the content of yaml file from python?


I would like to access and modify the content of a YAML file which looks like the following:

A: Sonstige
B:
  C: Sonstige
  D: null
  E: 1

I know that in order to access & modify the value of A in the above YAML file, I would use code like the following:

import yaml

def set_state(state):
    with open('my_file.yaml') as f:
        doc = yaml.load(f)

    doc['A'] = state

    with open('my_file.yaml', 'w') as f:
        yaml.dump(doc, f)

But what if I would like to modify the value of E in the above YAML file? How can I access the value of E, modify it, and dump it to a YAML file, similar to the above code? I have gone over the reference docs and was unable to find an answer for this.


Solution

  • If you want to change the value for E, e.g. to 2, you could do the following:

    import yaml
    
    def set_state(state):
        with open('my_file.yaml') as f:
            doc = yaml.load(f)
    
        doc['B']['E'] = state
    
        with open('my_file.yaml', 'w') as f:
            yaml.dump(doc, f)
    
    set_state(2)
    

    To get to the value of E, first index on B: doc['B'] is itself a dict, so doc['B']['E'] addresses the nested key.
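The same chained indexing works for any nesting depth. As a minimal sketch, the hypothetical helper `set_nested` below (not part of PyYAML, introduced only for illustration) walks a list of keys over the plain dicts that a YAML mapping loads into; the `doc` literal mirrors the file from the question:

```python
def set_nested(doc, keys, value):
    """Set doc[k1][k2]...[kn] = value; every intermediate key must already exist."""
    *parents, last = keys
    node = doc
    for key in parents:
        node = node[key]  # raises KeyError if an intermediate key is missing
    node[last] = value
    return doc


# The structure that the YAML file from the question loads into.
doc = {'A': 'Sonstige', 'B': {'C': 'Sonstige', 'D': None, 'E': 1}}

set_nested(doc, ['B', 'E'], 2)  # equivalent to doc['B']['E'] = 2
```

For a single, fixed path such as B/E, writing doc['B']['E'] directly is simpler; a helper like this only pays off when the path is determined at run time.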

    There are multiple problems with this code, though:

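    1. You are using load(), which in PyYAML is an unsafe operation, and older versions do not even warn about it. (Since PyYAML 5.1, calling load() without a Loader argument emits a warning, and since 6.0 the argument is required, so the call above fails on current versions.) Your data can be loaded with safe_load(), and even if it couldn't, it is much better to extend safe_load() to handle the tags you need (and only those tags) than to extend load().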

    2. Your output looks like:

      A: Sonstige
      B: {C: Sonstige, D: null, E: 2}
      

      which is not at all like your input, so pass the option default_flow_style=False to get your original block-style layout. (From PyYAML 5.1 on, block style is the default, but passing the option explicitly does no harm.)

    3. You don't need dump(), as data consisting of dicts, lists, and primitives can be dumped with safe_dump(). There is no safety issue there, but if you accidentally call set_state() with a non-primitive:

      class Dice:
          def __init__(self, sides):
              self.sides = sides
      
      set_state(Dice(6))
      

      then dump() will silently generate the non-portable:

      A: Sonstige
      B:
        C: Sonstige
        D: null
        E: !!python/object:__main__.Dice {sides: 6}
      

      instead of raising a RepresenterError, as safe_dump() would.

    4. Any comments in your original file will be lost. PyYAML does not preserve these.
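The difference described in point 3 can be checked without touching any file. The following is a small sketch, assuming PyYAML is installed; `Dice` is the class from the example above, while the in-memory `doc` dict and the variable names `unsafe_text` and `outcome` are illustrative only:

```python
import yaml


class Dice:
    def __init__(self, sides):
        self.sides = sides


# In-memory stand-in for the loaded file, with a non-primitive value for E.
doc = {'A': 'Sonstige', 'B': {'C': 'Sonstige', 'D': None, 'E': Dice(6)}}

# dump() accepts the object and writes a Python-specific tag for it.
unsafe_text = yaml.dump(doc)

# safe_dump() refuses to represent arbitrary Python objects.
try:
    yaml.safe_dump(doc, default_flow_style=False)
    outcome = 'dumped'
except yaml.representer.RepresenterError:
    outcome = 'rejected'
```

Failing early like this is usually preferable to writing a file that only Python's unsafe loader can read back.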

    With PyYAML you should do:

    import yaml
    
    def set_state(state):
        with open('my_file.yaml') as f:
            doc = yaml.safe_load(f)
    
        doc['B']['E'] = state
    
        with open('my_file.yaml', 'w') as f:
            yaml.safe_dump(doc, f, default_flow_style=False)
    
    set_state(2)
    

    If you also want to preserve comments, or have a mixture of block-style and flow-style you want to preserve, I recommend you use ruamel.yaml (disclaimer: I am the author of that package):

    import pathlib
    from ruamel.yaml import YAML
    
    def set_state(state):
        yaml = YAML()
        mf = pathlib.Path('my_file.yaml')
        doc = yaml.load(mf)
    
        doc['B']['E'] = state
    
        yaml.dump(doc, mf)
    

    This has the same result as before. In addition, the load() method is safe by default, your layout is preserved (along with any comments the YAML file might contain), and the Path instance is opened for reading or writing as necessary, so you need neither the with statements nor the second open() call with 'w' before dumping.