
Parsing child node with own namespace in DOM4j with Java


I hope someone can help me fix my mistake. I have been working on a DOM4j parser for about a month to extract over 500 data elements from an XML file using XPath. Unfortunately, I modeled my code on an old test file and only found the error of my ways after plugging in a production file. Here is a small sample of my code. As you can see from the HashMap, the full XML uses several namespaces. I cut the code down to extract only 3 elements.

import java.io.File;
import java.util.HashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;

public class CLOBTest {

public static void main(String[] args) {
      try {
         File inputFile = new File("C:/test.xml");
         //File inputFile = new File("C:/test1.xml");
         SAXReader reader = new SAXReader();
         Document document = reader.read( inputFile );

         Map<String, String> map = new HashMap<String, String>();

         map.put("exch", "http://at.dsh.cms.gov/exchange/1.0");
         map.put("ext", "http://at.dsh.cms.gov/extension/1.0");
         map.put("hix-core", "http://hix.cms.gov/0.1/hix-core");
         map.put("hix-ee", "http://hix.cms.gov/0.1/hix-ee");
         map.put("hix-pm", "http://hix.cms.gov/0.1/hix-pm");
         map.put("nc", "http://niem.gov/niem/niem-core/2.0");
         map.put("niem-core", "http://niem.gov/niem/niem-core/2.0");
         map.put("s", "http://niem.gov/niem/structures/2.0");
         map.put("scr", "http://niem.gov/niem/domains/screening/2.1");
         map.put("xsi", "http://www.w3.org/2001/XMLSchema-instance");


         XPath Request = DocumentHelper.createXPath("//exch:AccountTransferRequest");
         Request.setNamespaceURIs(map);

         Node request =  Request.selectSingleNode(document);

         System.out.println("  ID:        \t" + request.valueOf("ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID")); 
         System.out.println("  First Name:\t" + request.valueOf("hix-core:Person/niem-core:PersonName/niem-core:PersonGivenName")); 
         System.out.println("  Last Name: \t" + request.valueOf("hix-core:Person/niem-core:PersonName/niem-core:PersonSurName")); 

      } catch (DocumentException e) {
         e.printStackTrace();
      }
   }
}

The sample XML file (test.xml) gives the correct result of:

ID:         XXX012345
First Name: gina
Last Name:  davis

test.xml

 <H15>
 <requestMSG>
 <exch:AccountTransferRequest xmlns:exch="http://at.dsh.cms.gov/exchange/1.0" xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:niem-core="http://niem.gov/niem/niem-core/2.0" xmlns:s="http://niem.gov/niem/structures/2.0" xmlns:ext="http://at.dsh.cms.gov/extension/1.0" ext:atVersionText="2.3">
 <ext:TransferHeader>
 <ext:TransferActivity>
 <niem-core:ActivityIdentification xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>
 </niem-core:ActivityIdentification>
 </ext:TransferActivity>
 </ext:TransferHeader>
 <hix-core:Person xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:s="http://niem.gov/niem/structures/2.0" s:id="Mom">
 <niem-core:PersonName xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:PersonGivenName>gina</niem-core:PersonGivenName>
 <niem-core:PersonSurName>davis</niem-core:PersonSurName>
 </niem-core:PersonName>
 </hix-core:Person>
 </exch:AccountTransferRequest>
 </requestMSG>
 </H15>

However, if the exch:AccountTransferRequest element does NOT declare all of the namespaces, I get an unbound prefix error on the child nodes. I had assumed that assigning the HashMap to the Request XPath took care of all the prefix binding. I realized I was wrong after trying it on a production file (test1.xml) whose exch:AccountTransferRequest element does not carry the full set of namespace declarations.

test1.xml

 <H15>
 <requestMSG>
 <exch:AccountTransferRequest xmlns:exch="http://at.dsh.cms.gov/exchange/1.0" xmlns:ext="http://at.dsh.cms.gov/extension/1.0" ext:atVersionText="2.3">
 <ext:TransferHeader>
 <ext:TransferActivity>
 <niem-core:ActivityIdentification xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>
 </niem-core:ActivityIdentification>
 </ext:TransferActivity>
 </ext:TransferHeader>
 <hix-core:Person xmlns:hix-core="http://hix.cms.gov/0.1/hix-core" xmlns:s="http://niem.gov/niem/structures/2.0" s:id="Mom">
 <niem-core:PersonName xmlns:niem-core="http://niem.gov/niem/niem-core/2.0">
 <niem-core:PersonGivenName>gina</niem-core:PersonGivenName>
 <niem-core:PersonSurName>davis</niem-core:PersonSurName>
 </niem-core:PersonName>
 </hix-core:Person>
 </exch:AccountTransferRequest>
 </requestMSG>
 </H15>

test1.xml result:

Exception in thread "main" org.dom4j.XPathException: Exception occurred evaluting XPath: ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID. Exception: XPath expression uses unbound namespace prefix niem-core
    at org.dom4j.xpath.DefaultXPath.handleJaxenException(DefaultXPath.java:374)
    at org.dom4j.xpath.DefaultXPath.valueOf(DefaultXPath.java:185)
    at org.dom4j.tree.AbstractNode.valueOf(AbstractNode.java:191)
    at CLOBTest.main(CLOBTest.java:41)

Now, how do I extract the values of the child nodes that have their own namespace? Is there a way to do it while still going through the request node? I would like to salvage some of my effort if possible.


Solution

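  • OK, figured it out. I just had to cast the request node to an Element and add all the namespaces to it. As far as I can tell, request.valueOf(String) compiles a fresh XPath and resolves its prefixes against the namespaces in scope on the request node, so the map set on the Request XPath does not carry over. Once the namespaces were declared on that element, the relative XPath expressions evaluated from it resolved every prefix.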

             // Requires: import org.dom4j.Element;
             // Declare every prefix on the request element so relative XPath lookups from it can resolve them.
             Element test = (Element) request;
             test.addNamespace("exch", "http://at.dsh.cms.gov/exchange/1.0");
             test.addNamespace("ext", "http://at.dsh.cms.gov/extension/1.0");
             test.addNamespace("hix-core", "http://hix.cms.gov/0.1/hix-core");
             test.addNamespace("hix-ee", "http://hix.cms.gov/0.1/hix-ee");
             test.addNamespace("hix-pm", "http://hix.cms.gov/0.1/hix-pm");
             test.addNamespace("nc", "http://niem.gov/niem/niem-core/2.0");
             test.addNamespace("niem-core", "http://niem.gov/niem/niem-core/2.0");
             test.addNamespace("s", "http://niem.gov/niem/structures/2.0");
             test.addNamespace("scr", "http://niem.gov/niem/domains/screening/2.1");
             test.addNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");

             System.out.println("  ID:                                             \t" + test.valueOf("ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID"));
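
An alternative that leaves the parsed document untouched is to build an XPath object for each relative expression and give it the same prefix map, then evaluate it against the request node. The following is only a minimal sketch, assuming dom4j and Jaxen are on the classpath. The class name NamespaceAwareLookup, the helpers findRequest and valueOf, and the inline SAMPLE_XML (a trimmed copy of test1.xml) are hypothetical names introduced for illustration and are not part of the original code.

```java
import java.util.HashMap;
import java.util.Map;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

public class NamespaceAwareLookup {

    // Prefix-to-URI bindings used by the XPath expressions, independent of
    // where (or whether) the document declares those prefixes.
    static final Map<String, String> NAMESPACES = new HashMap<>();
    static {
        NAMESPACES.put("exch", "http://at.dsh.cms.gov/exchange/1.0");
        NAMESPACES.put("ext", "http://at.dsh.cms.gov/extension/1.0");
        NAMESPACES.put("hix-core", "http://hix.cms.gov/0.1/hix-core");
        NAMESPACES.put("niem-core", "http://niem.gov/niem/niem-core/2.0");
    }

    // Trimmed copy of test1.xml: niem-core and hix-core are declared only on child elements.
    static final String SAMPLE_XML =
        "<H15><requestMSG>"
        + "<exch:AccountTransferRequest xmlns:exch='http://at.dsh.cms.gov/exchange/1.0'"
        + " xmlns:ext='http://at.dsh.cms.gov/extension/1.0'>"
        + "<ext:TransferHeader><ext:TransferActivity>"
        + "<niem-core:ActivityIdentification xmlns:niem-core='http://niem.gov/niem/niem-core/2.0'>"
        + "<niem-core:IdentificationID>XXX012345</niem-core:IdentificationID>"
        + "</niem-core:ActivityIdentification>"
        + "</ext:TransferActivity></ext:TransferHeader>"
        + "<hix-core:Person xmlns:hix-core='http://hix.cms.gov/0.1/hix-core'>"
        + "<niem-core:PersonName xmlns:niem-core='http://niem.gov/niem/niem-core/2.0'>"
        + "<niem-core:PersonGivenName>gina</niem-core:PersonGivenName>"
        + "<niem-core:PersonSurName>davis</niem-core:PersonSurName>"
        + "</niem-core:PersonName></hix-core:Person>"
        + "</exch:AccountTransferRequest></requestMSG></H15>";

    // Parses the XML text and returns the exch:AccountTransferRequest node (null if absent).
    static Node findRequest(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            XPath requestPath = DocumentHelper.createXPath("//exch:AccountTransferRequest");
            requestPath.setNamespaceURIs(NAMESPACES);
            return requestPath.selectSingleNode(document);
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Evaluates a relative XPath against the context node, resolving prefixes
    // from NAMESPACES instead of from the namespaces in scope on the node.
    static String valueOf(Node context, String expression) {
        XPath xpath = DocumentHelper.createXPath(expression);
        xpath.setNamespaceURIs(NAMESPACES);
        return xpath.valueOf(context);
    }

    public static void main(String[] args) {
        Node request = findRequest(SAMPLE_XML);
        System.out.println("  ID:        \t" + valueOf(request,
            "ext:TransferHeader/ext:TransferActivity/niem-core:ActivityIdentification/niem-core:IdentificationID"));
        System.out.println("  First Name:\t" + valueOf(request,
            "hix-core:Person/niem-core:PersonName/niem-core:PersonGivenName"));
        System.out.println("  Last Name: \t" + valueOf(request,
            "hix-core:Person/niem-core:PersonName/niem-core:PersonSurName"));
    }
}
```

With this approach the existing relative expressions can stay as they are; only the request.valueOf(...) calls would change to go through the helper. The trade-off against addNamespace is that the document itself is never modified, which may matter if it is written back out later.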