python-3.x pyyaml ruamel.yaml

Dump the YAML to a variable instead of streaming it to stdout using ruamel.yaml


I cannot find a way to dump the YAML to a variable with ruamel.yaml. With PyYAML, I was able to do this as follows:

import yaml

with open('oldPipeline.yml', 'r') as f:
    data = yaml.load(f, Loader=yaml.FullLoader)

a = yaml.dump(data)  # with no stream argument, PyYAML returns the YAML as a string

But when I try the same with ruamel.yaml, it throws the exception TypeError: Need a stream argument when not dumping from context manager


Solution

  • The error clearly indicates you should provide a stream, so you should just do that:

    import io
    import ruamel.yaml
    
    yaml_str = b"""\
    fact: The recommended YAML file extension has been .yaml since September 2006
    origin: yaml.org FAQ
    """
    
    yaml = ruamel.yaml.YAML()
    data = yaml.load(yaml_str)
    buf = io.BytesIO()
    yaml.dump(data, buf)
    assert buf.getvalue() == yaml_str
    

    which gives no assertion error.

    Both ruamel.yaml and PyYAML use a stream interface. Writing the output to a buffer first is seldom necessary and should be avoided because it is inefficient, especially in the often-seen PyYAML form:

    print(yaml.dump(data))  # inefficient wrt time and space
    

    instead of the more appropriate

    yaml.dump(data, sys.stdout)
    

    Post-processing of the output should preferably be done in a stream-like object, or by using the transform option described in the Basic Usage section of the documentation.

    This is further explained here, and of course you could also have looked up how PyYAML does this.