ruby nokogiri xfdf

Is there Nokogiri example code for parsing Acrobat XFDF?


I am looking for a Ruby code snippet that shows how to use Nokogiri to parse Acrobat XFDF data.


Solution

  • It's no different than parsing any other XML:

    require 'nokogiri'
    
    xfdf = '<?xml version="1.0" encoding="UTF-8"?>
    <xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
      <f href="Demo PDF Form.pdf"/>
      <fields>
        <field name="Date of Birth">
          <value>01-01-1960</value>
        </field>
        <field name="Your Name">
          <value>Mr. Customer</value>
        </field>
      </fields>
      <ids original="FEBDB19E0CD32274C16CE13DCF244AD2" modified="5BE74DD4F607B7409DC03D600E466E12"/>
    </xfdf>
    '
    
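    doc = Nokogiri::XML(xfdf)
    
    # Nokogiri registers the root element's default namespace under the prefix "xmlns".
    doc.at('//xmlns:f')['href'] # => "Demo PDF Form.pdf"
    
    # .text on a <field> includes the whitespace around its <value> child.
    doc.at('//xmlns:field[@name="Date of Birth"]').text # => "\n      01-01-1960\n    "
    doc.at('//xmlns:field[@name="Your Name"]').text # => "\n      Mr. Customer\n    "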
    

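    The document declares a default XML namespace, so you have to honor it in the XPath expressions (the xmlns: prefix used above), or tell Nokogiri to ignore namespaces, for example by calling remove_namespaces! on the parsed document. Namespaces like this are common in XML.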