I need a regex to match Puppet Facter facts


Puppet facts look like this:

processors => {"models"=>["AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172"], "count"=>4, "physicalcount"=>2}
productname => VMware Virtual Platform
ps => ps -ef
puppetversion => 3.6.2
rubysitedir => /usr/local/brs/harmony-puppet/lib/ruby/site_ruby/2.1.0
rubyversion => 2.1.2
sshecdsakey => AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNDUmg8FQGCO/r/VGABUPwBqT8zTwzXwZCjTdBC6cXj1Mo5ypxuqO1Qtwg9uQagcS5eLNbv+SxHotpzYSXZ1R8g=
sshfp_dsa => SSHFP 2 1 42ffbd293f1501c0718b2b7b3852542329da1758
SSHFP 2 2 eb52d78a34bdadecc41b38366a5580c923bbb6cd0b81cec76de6379ce4251439
sshfp_ecdsa => SSHFP 3 1 d41abd2e3aff846b4efb59dbc8e4803875d33130
SSHFP 3 2 ae77a20a66859976e06efb7d6dd0819db4f9e9d93bc55da52a4bffff6acb1baa
sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326
SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884
sshrsakey => AAAAB3NzaC1yc2EAAAADAQABAAABAQDzcJ6158aIkY161vcDH6WKNgKAeUsxrHh+HJH9IEistcV2TUJSdHtG/p5peI+cTa0EhabbNw8ToUU3ZWYmiTmxxuZzxggAxCx6xhWNDgC/492QnouxHnqjxwpFyIYnLpdbaMRV/6t9iE7v09Gfb31TS3/DbAUh5yla1OOeHbxJQ/eUOUYgy7/6eFL43+R9SfiuP11VRK8r325mCOFaPqw8VuNeGul/rMnccBCbuFvgmQnfOo/ldwrfOL2W4qAvfE0bKyG13WrDSlauo+CFtYqDK08hCItjrbVKgVrOzLCuKGzKFuqOgF3u8Q1je23qu7eUmF7lZPYVWSEpkh0xlR0p
swapfree => 1.45 GB
swapfree_mb => 1482.82
swapsize => 1.46 GB
swapsize_mb => 1497.00
system_uptime => {"seconds"=>6034301, "hours"=>1676, "days"=>69, "uptime"=>"69 days"}
timezone => PDT

I am trying to split each fact into a key/value pair. Using this site:

http://rubular.com/

and this regex:

(?m)^(\S+) => (((?!^\S+ => ).)*)$

I get exactly what I want: all the keys and values match perfectly. The problem is that I'm writing my code in Java, and using this site:

http://java-regex-tester.appspot.com/

With the same input, I do not get the matches I want. Specifically, facts whose value contains a newline character, such as this one:

sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326
SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884

end up losing the second line of the value:

key = sshfp_rsa
value = SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326

Can anyone help me build the correct regex?


Solution

  • This regex should work for you:

    (?ms)^(\w+) => (.*?)(?=(?:\s^\w+ =>|\z))
    

    In Java code:

    // MULTILINE: ^ matches at the start of each line; DOTALL: . also matches newlines
    Pattern p = Pattern.compile("^(\\w+) => (.*?)(?=(?:\\s^\\w+ =>|\\z))", 
              Pattern.MULTILINE | Pattern.DOTALL);
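
A likely reason the original pattern behaves differently on the two sites: in Ruby, `(?m)` makes `.` match newlines, while in Java `(?m)` only changes how `^` and `$` behave, and `.` needs DOTALL (`(?s)`) to cross a line break. Below is a minimal sketch of how the pattern above could be used to collect facts into a map. The class name `FacterParser`, the method `parse`, and the shortened sample input are assumptions made for illustration, not part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and structure are illustrative only.
public class FacterParser {
    // MULTILINE: ^ matches at the start of every line.
    // DOTALL: . also matches line terminators, so a value can span lines.
    private static final Pattern FACT = Pattern.compile(
            "^(\\w+) => (.*?)(?=(?:\\s^\\w+ =>|\\z))",
            Pattern.MULTILINE | Pattern.DOTALL);

    // Returns the facts in the order they appear in the input.
    public static Map<String, String> parse(String facterOutput) {
        Map<String, String> facts = new LinkedHashMap<>();
        Matcher m = FACT.matcher(facterOutput);
        while (m.find()) {
            facts.put(m.group(1), m.group(2)); // group 1 = key, group 2 = value
        }
        return facts;
    }

    public static void main(String[] args) {
        // Shortened sample taken from the question's facter output.
        String sample = String.join("\n",
                "rubyversion => 2.1.2",
                "sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326",
                "SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884",
                "swapfree => 1.45 GB");
        parse(sample).forEach((key, value) ->
                System.out.println("key = " + key + "\nvalue = " + value));
    }
}
```

If the input ends with a newline, the last value will include it, because the lazy `.*?` runs to `\z`; calling `trim()` on the values is one way to handle that.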
    
