ruby nokogiri savon

How to parse attribute from savon response


I am posting the soap response I am working with at the bottom.
I need to grab the BodyType="HTML" attribute from <t:Body BodyType="HTML">

Doing response.body turns the entire thing into a hash and there is no sign of BodyType="HTML" in that.

Doing response.doc.css("t|Body") generates the error: Undefined namespace prefix: //t:Body (Nokogiri::XML::XPath::SyntaxError) because I don't see that namespace declaration in the XML.

Doing response.doc.css("Body") returns nothing.

What can I do to retrieve the value of BodyType?

Since there is no point in posting the code that makes the secure/private soap request, I am posting some basic code that reads in the XML from a flat file:

require 'savon'
require 'active_support/core_ext/hash/conversions'
require 'nokogiri'

@doc = Nokogiri::XML(File.open("tmp.xml"))
puts @doc.css("t|Body")

And here's the XML:

<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <h:ServerVersionInfo xmlns:h="http://schemas.microsoft.com/exchange/services/2006/types" xmlns="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" MajorVersion="15" MinorVersion="1" MajorBuildNumber="629" MinorBuildNumber="8" Version="V2016_07_13"/>
  </s:Header>
  <s:Body>
    <m:GetItemResponse xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
      <m:ResponseMessages>
        <m:GetItemResponseMessage ResponseClass="Success">
          <m:ResponseCode>NoError</m:ResponseCode>
          <m:Items>
            <t:Message>
              <t:ItemId Id="AAMkADE2NjQyMjVlLWNhY2UtNDNiMS04MzgxLWZiNzEyNzA0NDgwNQBGAAAAAACLt5QBAQ/GRYv+vEXkY5vLBwA6ksGFFTICTbjFW6e9FfRGAAAAAAEMAAA6ksGFFTICTbjFW6e9FfRGAAAu8FruAAA=" ChangeKey="CQAAABYAAAA6ksGFFTICTbjFW6e9FfRGAAAu9iR3"/>
              <t:ParentFolderId Id="AAMkADE2NjQyMjVlLWNhY2UtNDNiMS04MzgxLWZiNzEyNzA0NDgwNQAuAAAAAACLt5QBAQ/GRYv+vEXkY5vLAQA6ksGFFTICTbjFW6e9FfRGAAAAAAEMAAA=" ChangeKey="AQAAAA=="/>
              <t:ItemClass>IPM.Note</t:ItemClass>
              <t:Subject>From test</t:Subject>
              <t:Sensitivity>Normal</t:Sensitivity>
              <t:Body BodyType="HTML">Hello world</t:Body>
            </t:Message>
          </m:Items>
        </m:GetItemResponseMessage>
      </m:ResponseMessages>
    </m:GetItemResponse>
  </s:Body>
</s:Envelope>

Solution

  • Namespaces can really muddy the waters.

    By default, Nokogiri only looks at the root node for namespace declarations, so t|Body would work if xmlns:t had been defined in the root node.

    But, because it wasn't, you have to use collect_namespaces to tell Nokogiri to search the document and build a hash of all the ones it found. Then you can pass that hash to search, css, at or any of the search methods:

    require 'nokogiri'
    
    doc = Nokogiri::XML(<<EOT)
    <?xml version="1.0" encoding="utf-8"?>
    <s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
      <s:Body>
        <m:GetItemResponse xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
          <t:Message>
            <t:Body BodyType="HTML">Hello world</t:Body>
          </t:Message>
        </m:GetItemResponse>
      </s:Body>
    </s:Envelope>
    EOT
    ns = doc.collect_namespaces # => {"xmlns:s"=>"http://schemas.xmlsoap.org/soap/envelope/", "xmlns:t"=>"http://schemas.microsoft.com/exchange/services/2006/types", "xmlns:m"=>"http://schemas.microsoft.com/exchange/services/2006/messages"}
    doc.at("t|Body", ns)['BodyType'] # => "HTML"
    

    If you read the documentation for collect_namespaces you'll see that there's a potential problem: when the same prefix is declared more than once with different URIs, a later key can overwrite a previously found declaration. If that were a problem, you could work around it by finding the s:Body node, taking its first child element, and collecting the namespaces from that element:

    ns = doc.at('s|Body').first_element_child.namespaces 
    # => {"xmlns:m"=>"http://schemas.microsoft.com/exchange/services/2006/messages", "xmlns:t"=>"http://schemas.microsoft.com/exchange/services/2006/types", "xmlns:s"=>"http://schemas.xmlsoap.org/soap/envelope/"}
    

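    That will result in a hash of the namespaces in scope at that element, meaning the ones declared on it plus any inherited from its ancestors (hence xmlns:s still appears):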