XmlSlurper Doesn't Parse the Whole XML


I'm trying to access parts of an XML string after it was processed with XmlSlurper (inside the WS Lite plugin in case that matters). Here's a sample XML string:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
  <GetBusinessObjectByPublicIdResponse>
    <GetBusinessObjectByPublicIdResult>
      <BusinessObject REF="21cf6434ae" Name="Incident" ID="1518">
        <FieldList>
          <Field REF="f5b2ef7e04" Name="ID">
            93e5346ec110eee46ea095
          </Field>
          [tons more field entries]
        </FieldList>
      </BusinessObject>
    </GetBusinessObjectByPublicIdResult>
  </GetBusinessObjectByPublicIdResponse>
</soap:Body>
</soap:Envelope>

I have the body node, and accessing the GetBusinessObjectByPublicIdResponse node or the GetBusinessObjectByPublicIdResult node works fine. However, as soon as I try to go deeper into the XML, to the BusinessObject node or below, things stop working.

For example, the following code:

def node = body.GetBusinessObjectByPublicIdResponse[0].GetBusinessObjectByPublicIdResult[0]

returns a proper NodeChild object. However, if I do the following:

def node = body.GetBusinessObjectByPublicIdResponse[0].GetBusinessObjectByPublicIdResult[0].BusinessObject[0]

I get a NoChildren object. It seems like the BusinessObject node (and all its children) is not being parsed. The content exists as a string if I do the following:

body.GetBusinessObjectByPublicIdResponse[0].GetBusinessObjectByPublicIdResult[0].text()

but they don't exist as parsed objects.

This is the first time I've had to deal with XML in Groovy, so I might be doing something wrong, or maybe things are just broken. Any help is highly appreciated.


Solution

  • It turns out the problem was in the XML string itself (it's a SOAP response). All the nodes that weren't getting parsed were written with &lt; and &gt; instead of < and >, so they were escaped text inside GetBusinessObjectByPublicIdResult rather than real elements. XmlSlurper decoded the entities to < and > but, as expected for text content, did not parse the result as markup. That is why the XML looked correct when inspected after XmlSlurper had processed it.
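
A minimal sketch of the situation described above, and one possible way around it: parse the decoded text of the result node a second time. The response string here is a trimmed, hypothetical stand-in for the real SOAP response (only one field, REF attributes dropped), and the import assumes Groovy 3 or later; on Groovy 2.x the class is groovy.util.XmlSlurper and needs no import.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; Groovy 2.x uses groovy.util.XmlSlurper

// Hypothetical, trimmed-down response: the payload inside the result
// element is escaped text (&lt; ... &gt;), not real child elements.
def response = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Body>
  <GetBusinessObjectByPublicIdResponse>
    <GetBusinessObjectByPublicIdResult>&lt;BusinessObject Name="Incident" ID="1518"&gt;&lt;FieldList&gt;&lt;Field Name="ID"&gt;93e5346ec110eee46ea095&lt;/Field&gt;&lt;/FieldList&gt;&lt;/BusinessObject&gt;</GetBusinessObjectByPublicIdResult>
  </GetBusinessObjectByPublicIdResponse>
</soap:Body>
</soap:Envelope>'''

def envelope = new XmlSlurper().parseText(response)
def result = envelope.Body.GetBusinessObjectByPublicIdResponse.GetBusinessObjectByPublicIdResult

// First pass: the result element has text content but no child elements.
assert result.BusinessObject.size() == 0
assert result.text().startsWith('<BusinessObject')

// Second pass: treat the decoded text as an XML document of its own.
def businessObject = new XmlSlurper().parseText(result.text())
def idField = businessObject.FieldList.Field.find { it.@Name.text() == 'ID' }

assert businessObject.@Name.text() == 'Incident'
assert idField.text() == '93e5346ec110eee46ea095'
```

In the original code, the equivalent of result.text() would be the string returned by body.GetBusinessObjectByPublicIdResponse[0].GetBusinessObjectByPublicIdResult[0].text(). Whether a second parse is the right fix depends on the service; if the escaping is unintentional, fixing it at the source may be preferable.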