Parsing puppet-api yaml with python


I am creating a script that needs to parse the YAML output that Puppet produces.

When I make a request against, for example, https://puppet:8140/production/catalog/my.testserver.no, I get back some YAML that looks something like:

--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes: 
    - s_baseconfig
    ...
  edges: 
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
        exported: 

and so on... The problem is that when I call yaml.load(yamlstream), I get an error like:

yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:Puppet::Resource::Catalog'
 in "<string>", line 1, column 5:
   --- &id001 !ruby/object:Puppet::Reso ... 
       ^

As far as I know, this &id001 part is supported in yaml.
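
A quick check seems to confirm this. The sketch below assumes PyYAML is installed, and both sample documents are made up for illustration. The document that only uses an anchor and an alias loads fine, while the one carrying a `!ruby/object:` tag fails, so the tag rather than the anchor appears to be the culprit:

```python
import yaml

# Made-up sample: an anchor (&id001) and an alias (*id001), but no Ruby tags.
plain = """
first: &id001
  name: example
second: *id001
"""
data = yaml.safe_load(plain)
# The alias resolves to the very same Python object as the anchored node.
alias_is_same_object = data["first"] is data["second"]

# Made-up sample: same anchor, plus a Ruby tag like the one in the Puppet output.
tagged = """
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
"""
try:
    yaml.safe_load(tagged)
    tag_error = None
except yaml.constructor.ConstructorError as exc:
    # PyYAML has no constructor registered for the Ruby tag.
    tag_error = exc

print(alias_is_same_object)
print(tag_error)
```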

Is there any way around this? Can I tell the YAML parser to ignore these tags? I only need a couple of lines from the YAML stream, so maybe regex is my friend here? Has anyone written YAML cleanup regexes before?

You can get the yaml output with curl like:

curl --cert /var/lib/puppet/ssl/certs/$(hostname).pem --key /var/lib/puppet/ssl/private_keys/$(hostname).pem --cacert /var/lib/puppet/ssl/certs/ca.pem -H 'Accept: yaml' https://puppet:8140/production/catalog/$(hostname)

I also found some info about this on the Puppet mailing list at http://www.mail-archive.com/[email protected]/msg24143.html, but I can't get it to work correctly...


Solution

  • I emailed Kirill Simonov, the creator of PyYAML, to get help parsing Puppet YAML files.

    He gladly helped with the following code. It was written for parsing a Puppet log, but I'm sure you can modify it to parse other Puppet YAML files.

    The idea is to register constructors for the Ruby object tags, so that PyYAML can read the tagged nodes as plain mappings and strings.

    Here goes:

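    #!/usr/bin/env python
    
    import yaml
    
    def construct_ruby_object(loader, suffix, node):
        # Treat any "!ruby/object:<Class>" node as a plain mapping.
        return loader.construct_yaml_map(node)
    
    def construct_ruby_sym(loader, node):
        # Treat Ruby symbols ("!ruby/sym") as plain strings.
        return loader.construct_yaml_str(node)
    
    # Register the constructors on PyYAML's default loaders.
    yaml.add_multi_constructor(u"!ruby/object:", construct_ruby_object)
    yaml.add_constructor(u"!ruby/sym", construct_ruby_sym)
    
    # open() replaces Python 2's file(); recent PyYAML needs an explicit Loader.
    with open('201203130939.yaml', 'r') as stream:
        mydata = yaml.load(stream, Loader=yaml.Loader)
    print(mydata)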
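
For the catalog in the question, a variation on the same idea might look like the sketch below. Some assumptions: the `PuppetLoader` name and the sample document are made up for illustration, and subclassing `yaml.SafeLoader` is a choice made in this sketch, not part of Kirill's code. It keeps the Ruby constructors off PyYAML's global loaders and avoids building arbitrary Python objects.

```python
import yaml


class PuppetLoader(yaml.SafeLoader):
    """Hypothetical loader that only adds handling for Ruby tags."""


def construct_ruby_object(loader, suffix, node):
    # Treat any "!ruby/object:<Class>" node as a plain mapping.
    return loader.construct_yaml_map(node)


def construct_ruby_sym(loader, node):
    # Treat Ruby symbols as plain strings.
    return loader.construct_yaml_str(node)


PuppetLoader.add_multi_constructor("!ruby/object:", construct_ruby_object)
PuppetLoader.add_constructor("!ruby/sym", construct_ruby_sym)

# Made-up sample shaped like the catalog excerpt in the question.
sample = """\
--- &id001 !ruby/object:Puppet::Resource::Catalog
  aliases: {}
  applying: false
  classes:
    - s_baseconfig
  edges:
    - &id111 !ruby/object:Puppet::Relationship
      source: &id047 !ruby/object:Puppet::Resource
        catalog: *id001
        exported: false
"""

catalog = yaml.load(sample, Loader=PuppetLoader)

# Pull out just the parts that are needed.
classes = catalog["classes"]
source = catalog["edges"][0]["source"]
print(classes)
```

Because the catalog refers back to itself through `*id001`, the loaded structure is recursive: `source["catalog"]` is the same dictionary as `catalog`. Keep that in mind before printing or serializing the whole result.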