ruby, xml, nokogiri, textmate

Parse TextMate snippet with Nokogiri


A TextMate snippet (.tmSnippet) usually looks something like this. Some key/string pairs are optional, and the pairs can appear in any order.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>content</key>
        <string>${1:the actual snippet}</string>
        <key>tabTrigger</key>
        <string>my_trigger</string>
        <key>name</key>
        <string>This is my Snippet's name</string>
        <key>scope</key>
        <string>source.js</string>
        <key>uuid</key>
        <string>6C2985F1-9BB8-43D7-A85C-1006B2932A0D</string>
</dict>
</plist>

I'm trying to parse this using Nokogiri, but since the tags are all <key> and <string> and the position of each key/string pair can vary, I'm not sure how to do that. I'm after the scope, tabTrigger, content and name.


Solution

  • Assuming the sub-nodes of a dict node are just key-string pairs, this:

    require 'nokogiri'

    # The keys we want to extract from the snippet
    kws = %w{ scope tabTrigger content name }

    doc = Nokogiri::XML(File.read('a.tmsnippet'))

    doc.xpath('//dict').each do |dict_node|
      # Children alternate <key>, <string>, so take their text two at a time
      dict_node.element_children.map(&:content).each_slice(2) do |k, v|
        next unless kws.include? k
        puts "#{k} -> #{v}"
      end
    end
    

    produces

    content -> ${1:the actual snippet}
    tabTrigger -> my_trigger
    name -> This is my Snippet's name
    scope -> source.js
    

    Otherwise (for example, if a value is a nested <dict> or an <array> rather than a <string>) you need to check each node's type before reading its content.