
Creating Custom Tag in PyYAML


I'm trying to use Python's PyYAML to create a custom tag that will let me retrieve environment variables from within my YAML file.

import os
import yaml

class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!Env'

    def __init__(self, env_var):
       self.env_var = env_var

    def __repr__(self):
       return os.environ.get(self.env_var)

settings_file = open('conf/defaults.yaml', 'r')
settings = yaml.load(settings_file)

And inside of defaults.yaml is simply:

example: !ENV foo

The error I keep getting:

yaml.constructor.ConstructorError: 
could not determine a constructor for the tag '!ENV' in 
"defaults.yaml", line 1, column 10

I also plan to have more than one custom tag (assuming I can get this one working).


Solution

  • Your PyYAML class had a few problems:

    1. yaml_tag is case sensitive, so !Env and !ENV are different tags.
    2. As the documentation explains, yaml.YAMLObject uses a metaclass to register itself and provides default to_yaml and from_yaml methods. By default, however, those methods require the argument to your custom tag (in this case !ENV) to be a mapping. So, to work with the default methods, your defaults.yaml file would have to look like this instead (just as an example):

    example: !ENV {env_var: "PWD", test: "test"}

    With yaml_tag changed to '!ENV' so that it matches the file, your code then works as written; in my case print(settings) now prints {'example': /home/Fred}. However, you are using load instead of safe_load. In their answer below, Anthon pointed out that this is dangerous: load can construct arbitrary Python objects, so parsed YAML from an untrusted source can read or overwrite data anywhere on the disk.
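To make the mapping-based default concrete, here is a minimal sketch of what the inherited from_yaml does with the mapping form. It assumes the tag is spelled '!ENV' on both sides and loads a trusted inline string; the explicit Loader=yaml.Loader argument is my choice for the demonstration (recent PyYAML releases require a Loader to be passed) and should not be used on untrusted input.

```python
import yaml


class EnvTag(yaml.YAMLObject):
    # Must match the tag in the YAML text exactly, including case.
    yaml_tag = '!ENV'

    def __init__(self, env_var):
        self.env_var = env_var

    def __repr__(self):
        return 'EnvTag(env_var={!r})'.format(self.env_var)


# Trusted inline document using the mapping form the default from_yaml expects.
doc = 'example: !ENV {env_var: "PWD", test: "test"}'

# yaml.Loader is the full, unsafe loader; acceptable here only because
# the input is a literal string defined in this script.
settings = yaml.load(doc, Loader=yaml.Loader)
obj = settings['example']

print(obj)
# The default from_yaml bypasses __init__ and copies every mapping key
# onto the instance, so 'test' becomes an attribute as well.
print(vars(obj))
```

Note that the inherited from_yaml never calls __init__; it fills in the instance attributes directly from the mapping keys, which is why the extra test key is accepted.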

    You can still easily use your original YAML file format, example: !ENV foo. You just have to define appropriate to_yaml and from_yaml methods in class EnvTag, ones that can parse and emit scalar values like the string "foo".

    So:

    import os
    import yaml
    
    class EnvTag(yaml.YAMLObject):
        yaml_tag = u'!ENV'
    
        def __init__(self, env_var):
            self.env_var = env_var
    
        def __repr__(self):
            v = os.environ.get(self.env_var) or ''
            return 'EnvTag({}, contains={})'.format(self.env_var, v)
    
        @classmethod
        def from_yaml(cls, loader, node):
            # The node is a scalar such as "foo"; use its text as the variable name.
            return cls(loader.construct_scalar(node))
    
        @classmethod
        def to_yaml(cls, dumper, data):
            return dumper.represent_scalar(cls.yaml_tag, data.env_var)
    
    # Required for safe_load
    yaml.SafeLoader.add_constructor('!ENV', EnvTag.from_yaml)
    # Required for safe_dump
    yaml.SafeDumper.add_multi_representer(EnvTag, EnvTag.to_yaml)
    
    # Use a context manager so the file is closed after parsing
    with open('defaults.yaml', 'r') as settings_file:
        settings = yaml.safe_load(settings_file)
    print(settings)
    
    s = yaml.safe_dump(settings)
    print(s)
    

    When this program is run, it outputs:

    {'example': EnvTag(foo, contains=)}
    {example: !ENV 'foo'}
    

    This code has the benefits of (1) using the original PyYAML, so there is nothing extra to install, and (2) adding a representer. :)
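Since the question mentions wanting the variable's value and more than one custom tag, a function-based variant may also be worth sketching. This is an alternative to the class above, not part of the original answer: SettingsLoader, construct_env, construct_upper, the !UPPER tag and the DEMO_SETTING variable are all hypothetical names chosen for illustration. Registering constructors on a SafeLoader subclass keeps the custom tags off the global SafeLoader.

```python
import os
import yaml


class SettingsLoader(yaml.SafeLoader):
    """Hypothetical loader subclass that carries the custom tags."""


def construct_env(loader, node):
    # "!ENV NAME" resolves to the value of $NAME, or '' when it is unset.
    name = loader.construct_scalar(node)
    return os.environ.get(name, '')


def construct_upper(loader, node):
    # Hypothetical second tag, showing that more tags are registered the same way.
    return loader.construct_scalar(node).upper()


SettingsLoader.add_constructor('!ENV', construct_env)
SettingsLoader.add_constructor('!UPPER', construct_upper)

# Set a variable only so that the example is reproducible.
os.environ['DEMO_SETTING'] = 'hello'

doc = "example: !ENV DEMO_SETTING\nshout: !UPPER quiet\n"
settings = yaml.load(doc, Loader=SettingsLoader)
print(settings)
```

With this approach the loaded dictionary holds plain strings rather than EnvTag objects, so the tag is not preserved if the settings are dumped again; the class-based version above is the better fit when round-tripping matters.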