I am trying to parse an XML document using Xerces, but I can't seem to access the data within the elements. Below is a sample XML document:
<sample>
  <block>
    <name>tom</name>
    <age>44</age>
    <car>BMW</car>
  </block>
  <block>
    <name>Jenny</name>
    <age>23</age>
    <car>Ford</car>
  </block>
</sample>
So far the only output I can produce is:
Sample
block
name
age
car
block
name
age
car
This is just a list of the node names. I have tried node.getNodeValue(), but it just returns null, so I'm guessing that's wrong!
How can I access the data inside? Here are the basics of what I have so far:
public static void display(String file) {
    try {
        DOMParser parser = new DOMParser();
        parser.parse(file);
        Document doc = parser.getDocument();
        read(doc);
    } catch (Exception e) {
        e.printStackTrace(System.err);
    }
}

public static void read(Node node) {
    if (node == null) {
        return;
    }
    int type = node.getNodeType();
    //System.out.print((node));
    switch (type) {
        case Node.DOCUMENT_NODE: {
            read(((Document) node).getDocumentElement());
            break;
        }
        case Node.TEXT_NODE:
            break;
        case Node.ELEMENT_NODE: {
            System.out.println(node.getNodeName());
            NodeList child = node.getChildNodes();
            if (child != null) {
                int length = child.getLength();
                for (int i = 0; i < length; i++) {
                    read(child.item(i));
                }
            }
            break;
        }
    }
}
getNodeValue() returns the value of a text node, which your code currently skips over. For an element node it returns null, which is what you are seeing.
public static void read(Node node) {
    if (node == null) {
        return;
    }
    int type = node.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE: {
            System.out.println("Doc node; name: " + node.getNodeName());
            read(((Document) node).getDocumentElement());
            break;
        }
        case Node.TEXT_NODE:
            System.out.println("Text node; value: " + node.getNodeValue().replaceAll("\\s", ""));
            break;
        case Node.ELEMENT_NODE: {
            System.out.println("Element node; name: " + node.getNodeName());
            NodeList children = node.getChildNodes();
            int length = children.getLength();
            for (int i = 0; i < length; i++) {
                read(children.item(i));
            }
            break;
        }
    }
}
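I think where you might be getting confused is in how XML is actually structured, and what the children of something like this are:
<element>
  <child_element>foo</child_element>
</element>
Here <child_element> has no value of its own; its text "foo" lives in a child text node. The whitespace between the tags also shows up as text nodes, so <element> has three children: a whitespace text node, <child_element>, and another whitespace text node. Running the code above on your document should make this visible.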
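To make the element/text-node relationship concrete, here is a minimal sketch that walks the sample document and pairs each non-blank text node with the name of its parent element. It is an illustration under some assumptions rather than a drop-in fix: it uses the JDK's standard DocumentBuilderFactory instead of the Xerces DOMParser class from the question, it parses the XML from an inlined string, and the class and method names (ElementTextDemo, leafValues, collect) are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ElementTextDemo {

    // The sample document from the question, inlined (assumption: parsing
    // from a string rather than a file keeps the sketch self-contained).
    static final String SAMPLE =
        "<sample>\n"
        + "<block>\n<name>tom</name>\n<age>44</age>\n<car>BMW</car>\n</block>\n"
        + "<block>\n<name>Jenny</name>\n<age>23</age>\n<car>Ford</car>\n</block>\n"
        + "</sample>";

    // Returns one "elementName=text" entry per non-blank text node.
    static List<String> leafValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            collect(doc.getDocumentElement(), out);
            return out;
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Recursively visits the children of 'node'; the data lives in TEXT_NODE
    // children, and whitespace-only text nodes between tags are skipped.
    private static void collect(Node node, List<String> out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.add(node.getNodeName() + "=" + text);
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect(child, out);
            }
        }
    }

    public static void main(String[] args) {
        leafValues(SAMPLE).forEach(System.out::println);
    }
}
```

Running main should print one line per value (name=tom, age=44, car=BMW, and so on). The trim() call is a design choice for this sketch: it drops only leading and trailing whitespace, so text with internal spaces survives intact.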
It's also why things like dom4j, JAXB, XPath, etc. make things much easier.
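As a rough illustration of the XPath route, the sketch below uses the javax.xml.xpath API that ships with the JDK to pull the text of every <name> element in one query. The class name XPathDemo, the names method, and the inlined XML string are assumptions made for the example; the expression /sample/block/name simply follows the structure of the sample document in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {

    // Trimmed-down copy of the question's sample document (assumption:
    // inlined here so the sketch runs without an external file).
    static final String SAMPLE =
        "<sample>"
        + "<block><name>tom</name><age>44</age><car>BMW</car></block>"
        + "<block><name>Jenny</name><age>23</age><car>Ford</car></block>"
        + "</sample>";

    // Returns the text of every /sample/block/name element, in document order.
    static List<String> names(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/sample/block/name",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() concatenates the element's descendant text nodes.
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        names(SAMPLE).forEach(System.out::println);
    }
}
```

Compared with the recursive walk, the query states which elements you want and leaves the traversal to the XPath engine, which is usually less code when you only need a few known fields.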