java, jaxb, cxf, unmarshalling, wsdl2java

What is the simplest way to extract an XML node for JAXB.unmarshal()?


I use the wsdl2java goal of cxf-codegen-plugin to generate Java from a WSDL. Then, in my tests, I use JAXB.unmarshal() to populate classes from a raw webservice XML result.

A typical example is GetAllResponseType response = unmarshal("get-all.xml", GetAllResponseType.class), using the following method:

<T> T unmarshal(String filename, Class<T> clazz) throws Exception {
    InputStream body = getClass().getResourceAsStream(filename);
    return javax.xml.bind.JAXB.unmarshal(body, clazz);
}

The problem is this: the raw XML response always has enclosing Envelope and Body tags, which are not generated as classes by wsdl2java:

<n4:Envelope xmlns:http="http://schemas.xmlsoap.org/wsdl/http/" xmlns:n="http://www.informatica.com/wsdl/"
         xmlns:n4="http://schemas.xmlsoap.org/soap/envelope/" xmlns:n5="http://schemas.xmlsoap.org/wsdl/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <n4:Body>
    <n:getAllResponse xmlns:n="http://www.informatica.com/wsdl/">
        <n:getAllResponseElement>
           ...
        </n:getAllResponseElement>
    </n:getAllResponse>
  </n4:Body>
</n4:Envelope>

So, in order to use JAXB.unmarshal() I have to

  1. either strip away the surrounding Envelope/Body tags manually in get-all.xml
  2. or extract the getAllResponse node and re-convert it to an InputStream
  3. or create the Envelope and Body classes

Currently I do 2, but it's a lot of code:

<T> T unmarshal(String filename, Class<T> clazz) throws Exception {
    InputStream is = getClass().getResourceAsStream(filename);
    InputStream body = nodeContent(is, "n4:Body");
    return javax.xml.bind.JAXB.unmarshal(body, clazz);
}

InputStream nodeContent(InputStream is, String name) throws Exception {
    DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
    Document doc = docBuilder.parse(is);
    Node node = firstNonTextNode(doc.getElementsByTagName(name).item(0).getChildNodes());
    return nodeToStream(node);
}

Node firstNonTextNode(NodeList nl) {
    for (int i = 0; i < nl.getLength(); i++) {
        if (!(nl.item(i) instanceof Text)) {
            return nl.item(i);
        }
    }
    throw new RuntimeException("Couldn't find nontext node");
}

InputStream nodeToStream(Node node) throws Exception {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    Source xmlSource = new DOMSource(node);
    Result outputTarget = new StreamResult(outputStream);
    TransformerFactory.newInstance().newTransformer().transform(xmlSource, outputTarget);
    return new ByteArrayInputStream(outputStream.toByteArray());
}

My questions are:

  • Is there an easier way to do the extraction in 2? I am tempted to just use a regexp. I tried XPath, but somehow I couldn't get it to work. Code examples would be helpful.
  • Can I get wsdl2java to create the Body / Envelope classes (3), or is it easy to create them myself?

Solution

  • The node inside n4:Body can be unmarshalled by advancing an XMLStreamReader to it and handing the reader to a "raw" JAXB Unmarshaller instead of the JAXB convenience class:

    <T> T unmarshal(String filename, Class<T> clazz) throws Exception {
        XMLInputFactory xif = XMLInputFactory.newFactory();
        XMLStreamReader xsr = xif.createXMLStreamReader(getClass().getResourceAsStream(filename));
        // Move to the root element, then keep advancing until the Body tag.
        xsr.nextTag();
        while (!xsr.getLocalName().equals("Body")) {
            xsr.nextTag();
        }
        // Step from Body to its first child element, i.e. the payload.
        xsr.nextTag();
        // Unmarshal only the element the reader is currently positioned on.
        Unmarshaller unmarshaller = JAXBContext.newInstance(clazz).createUnmarshaller();
        return unmarshaller.unmarshal(xsr, clazz).getValue();
    }
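
To see where the cursor ends up before it is handed to the Unmarshaller, here is a minimal sketch that uses only the JDK's StAX API, so it runs without JAXB on the classpath. The class name BodyCursorDemo, the helper names and the inline SAMPLE string are made up for illustration. Unlike the loop above, this variant also checks the SOAP namespace and steps with next() rather than nextTag(), on the assumption that a response might contain a Header with text content, which nextTag() would reject.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class BodyCursorDemo {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Trimmed-down copy of the response from the question (illustrative only).
    static final String SAMPLE =
        "<n4:Envelope xmlns:n4=\"http://schemas.xmlsoap.org/soap/envelope/\">\n"
      + "  <n4:Body>\n"
      + "    <n:getAllResponse xmlns:n=\"http://www.informatica.com/wsdl/\">\n"
      + "      <n:getAllResponseElement/>\n"
      + "    </n:getAllResponse>\n"
      + "  </n4:Body>\n"
      + "</n4:Envelope>";

    // Moves the cursor to the first element inside the SOAP Body.
    static XMLStreamReader positionOnPayload(String xml) throws XMLStreamException {
        XMLStreamReader xsr = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xml));
        while (xsr.hasNext()) {
            if (xsr.next() == XMLStreamConstants.START_ELEMENT
                    && "Body".equals(xsr.getLocalName())
                    && SOAP_NS.equals(xsr.getNamespaceURI())) {
                xsr.nextTag(); // skips whitespace, lands on the payload element
                return xsr;
            }
        }
        throw new XMLStreamException("No SOAP Body element found");
    }

    // Returns the payload element's name as "{namespace}localName".
    static String payloadName(String xml) {
        try {
            XMLStreamReader xsr = positionOnPayload(xml);
            return "{" + xsr.getNamespaceURI() + "}" + xsr.getLocalName();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints {http://www.informatica.com/wsdl/}getAllResponse
        System.out.println(payloadName(SAMPLE));
    }
}
```

A reader positioned this way could be passed to unmarshaller.unmarshal(xsr, clazz) exactly as in the method above.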
    

    Thanks to Blaise Doughan for helping on this answer.
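
  • Since the question also asks for an XPath example, here is a hedged sketch of that route. It assumes the difficulty was namespace-related, which the question does not confirm: with a namespace-aware parser, an expression like //n4:Body matches nothing unless a NamespaceContext is registered, and matching on local-name() sidesteps that. The class name BodyXPathDemo, the helper names and the inline SAMPLE string are made up for illustration, and only JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class BodyXPathDemo {
    // First element child of Envelope/Body, matched by local name only.
    static final String PAYLOAD_XPATH =
        "/*[local-name()='Envelope']/*[local-name()='Body']/*[1]";

    // Trimmed-down copy of the response from the question (illustrative only).
    static final String SAMPLE =
        "<n4:Envelope xmlns:n4=\"http://schemas.xmlsoap.org/soap/envelope/\">\n"
      + "  <n4:Body>\n"
      + "    <n:getAllResponse xmlns:n=\"http://www.informatica.com/wsdl/\">\n"
      + "      <n:getAllResponseElement/>\n"
      + "    </n:getAllResponse>\n"
      + "  </n4:Body>\n"
      + "</n4:Envelope>";

    // Parses the stream and returns the payload node, or null if none matches.
    static Node payloadNode(InputStream in) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed so local-name() and namespaces resolve
        Document doc = dbf.newDocumentBuilder().parse(in);
        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(PAYLOAD_XPATH, doc, XPathConstants.NODE);
    }

    // Returns the payload element's name as "{namespace}localName".
    static String payloadName(String xml) {
        try {
            InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            Node node = payloadNode(in);
            return "{" + node.getNamespaceURI() + "}" + node.getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints {http://www.informatica.com/wsdl/}getAllResponse
        System.out.println(payloadName(SAMPLE));
    }
}
```

    The returned Node does not need to be converted back to an InputStream: the JAXB Unmarshaller has an unmarshal(Node, Class<T>) overload, so unmarshaller.unmarshal(node, clazz).getValue() should be able to replace the nodeToStream() step from the question.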