
Using a custom Loader with ruamel.yaml 0.15.0


Ruamel.yaml is changing its API: https://yaml.readthedocs.io/en/latest/api.html#loading

This seems to remove support for custom loaders, or at least it is unclear how to use a custom loader with the latest API.

For reference my custom loader is:


import io
import os
import ruamel.yaml
import ubelt as ub


@ub.memoize
def _custom_ruaml_loader():
    """
    References:
        https://stackoverflow.com/questions/59635900/ruamel-yaml-custom-commentedmapping-for-custom-tags
        https://stackoverflow.com/questions/528281/how-can-i-include-a-yaml-file-inside-another
    """
    import ruamel.yaml
    Loader = ruamel.yaml.RoundTripLoader

    def _construct_include_tag(self, node):
        print(f'node={node}')
        if isinstance(node.value, list):
            return [Yaml.coerce(v.value) for v in node.value]
        else:
            external_fpath = ub.Path(node.value)
            if not external_fpath.exists():
                raise IOError(f'Included external yaml file {external_fpath} '
                              'does not exist')
            return Yaml.load(node.value)
    Loader.add_constructor("!include", _construct_include_tag)
    return Loader


Loader = _custom_ruaml_loader()
data = ruamel.yaml.load(file, Loader=Loader, preserve_quotes=True)

The warning states that I should do something like:

from ruamel.yaml import YAML
yaml = YAML(typ='unsafe', pure=True)
data = yaml.load(file)

But I don't see how I can use a custom Loader with this new API.

Has support for custom Loaders been removed, or is there just a new way to use them?


Solution

  • You don't have a custom loader. You have the standard RoundTripLoader, for which you extended the set of tags it can process by adding a constructor for the tag !include. The name _custom_ruaml_loader is misleading: add print(Loader) at the end of the file and you'll see that it is the normal RoundTripLoader class.

    It is unclear where Yaml comes from and why you are using ubelt, so I can't comment on that. Apart from that, I'll assume you have a file a.yaml looking like:

    a: 'xyz'
    b: !include b.yaml   # or [b.yaml] but I have no idea what Yaml.coerce does with that
    

    and a file b.yaml:

    c: !include c.yaml
    d: 42
    

    and a file c.yaml:

    e: 2011-10-02
    f: enough
    

    You can still add constructors for additional tags, and you can also still add custom constructors.

    The constructors are added on a RoundTripConstructor/SafeConstructor/Constructor, or on a subclass of those (a Loader is nothing but a composite of the various stages of YAML loading, including a Constructor or one of its subclasses):

    import sys
    from pathlib import Path
    import ruamel.yaml
    
    def _construct_include_tag(constructor, node):
        yaml = ruamel.yaml.YAML()  # make a new instance, although you could get the YAML
                                   # instance from the constructor argument
        external_fpath = Path(node.value)
        if not external_fpath.exists():
            raise IOError(f'Included external yaml file {external_fpath} '
                            'does not exist')
        res = yaml.load(external_fpath)
        return res
    
    ruamel.yaml.constructor.RoundTripConstructor.add_constructor('!include', _construct_include_tag)
    file_in = Path('a.yaml')
    yaml = ruamel.yaml.YAML()
    yaml.preserve_quotes = True
    
    data = yaml.load(file_in)
    yaml.dump(data, sys.stdout)
    

    which gives:

    a: 'xyz'
    b:
      c:
        e: 2011-10-02
        f: enough
      d: 42
    

    Please note that ruamel.yaml can round-trip a.yaml without an added constructor, but it will then not do anything special with !include apart from preserving it.
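
Why one add_constructor call on RoundTripConstructor reaches every YAML() instance can be illustrated with a small model. This is only a sketch of my understanding of how ruamel.yaml (like PyYAML) keeps the tag table on the class and copies it the first time a class registers a tag; it is not the library source, and BaseLike, RoundTripLike, IncludeLike, knows and include are names made up for the illustration.

```python
class BaseLike:
    # Tag -> function table, stored on the class, not on instances.
    yaml_constructors = {}

    @classmethod
    def add_constructor(cls, tag, func):
        # Copy the inherited table the first time this class registers a tag,
        # so a subclass never writes into its parent's table.
        if 'yaml_constructors' not in cls.__dict__:
            cls.yaml_constructors = cls.yaml_constructors.copy()
        cls.yaml_constructors[tag] = func

    def knows(self, tag):
        # Instances look the tag up through their class.
        return tag in self.yaml_constructors


class RoundTripLike(BaseLike):      # stands in for RoundTripConstructor
    pass


class IncludeLike(RoundTripLike):   # stands in for a user-defined subclass
    pass


def include(constructor, node):     # hypothetical handler, never called here
    return node


# Registering on the subclass leaves the parent class untouched.
IncludeLike.add_constructor('!include', include)
print(RoundTripLike().knows('!include'))  # False
print(IncludeLike().knows('!include'))    # True

# Registering on the parent changes every instance created from it.
RoundTripLike.add_constructor('!include', include)
print(RoundTripLike().knows('!include'))  # True
```

Under this model, registering on the shared class is the simplest route, at the cost of changing behaviour for every loader that uses that class.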

    As you can see, you don't need a custom constructor class. But the above affects every file loaded with the RoundTripConstructor, in this case including b.yaml, which is loaded as a result of encountering !include. If you don't want that, you can create your own Constructor subclass and add the constructor for the tag '!include' to that, without affecting the RoundTripConstructor:

    import sys
    from pathlib import Path
    import ruamel.yaml
    
    class IncludeConstructor(ruamel.yaml.RoundTripConstructor):
        pass
    
    def _construct_include_tag(constructor, node):
        yaml = ruamel.yaml.YAML()  # make a new instance, although you could get the YAML
                                   # instance from the constructor argument
        external_fpath = Path(node.value)
        if not external_fpath.exists():
            raise IOError(f'Included external yaml file {external_fpath} '
                            'does not exist')
        res = yaml.load(external_fpath)
        return res
    
    IncludeConstructor.add_constructor('!include', _construct_include_tag)
    
    file_in = Path('a.yaml')
    yaml = ruamel.yaml.YAML()
    yaml.Constructor = IncludeConstructor
    yaml.preserve_quotes = True
    
    data = yaml.load(file_in)
    yaml.dump(data, sys.stdout)
    

    which gives:

    a: 'xyz'
    b:
      c: !include c.yaml
      d: 42
    

    This happens because the default RoundTripConstructor is not affected: the YAML() instance created inside the function _construct_include_tag still uses it, has no constructor for !include, and so only preserves the tag. c.yaml is never loaded.

    This was done with ruamel.yaml==0.17.32 with Python 3.11.4 on macOS.