I am working through a Java/XML tutorial and have questions regarding getNextSibling() and getFirstChild().
At times I am able to follow along and determine which calls are needed, but I find myself stumbling over these two calls and their expected results.
Here is the data being used.
<?xml version="1.0" encoding="UTF-8"?>
<AllStorage>
   <NorthAmerica>
      <EastCoast>
         <DeliveryLocations>
            <Location>North East </Location>
            <Item1>Full</Item1>
            <Item2>Empty</Item2>
         </DeliveryLocations>
      </EastCoast>
   </NorthAmerica>
</AllStorage>
The following is the code being used.
Within the code are my comments describing what I am seeing as the code progresses.
There are 3 questions embedded within the code at different locations where I seem to get tripped up.
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XMLFromjava2
{
    public static void main(String[] args) throws IOException, ParserConfigurationException, org.xml.sax.SAXException
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(true);
        factory.setCoalescing(true);
        factory.setNamespaceAware(false);
        factory.setValidating(false);
        DocumentBuilder parser = factory.newDocumentBuilder();
        Document document = parser.parse(new File("C:\\Downloads\\DummyData2.xml"));
        NodeList locations = document.getElementsByTagName("DeliveryLocations"); ///Starting point is the <DeliveryLocations> tag
        int numLocations = locations.getLength();
        System.out.println("There are " + numLocations + " locations.");
        for (int i = 0; i < numLocations; i++) ///Iterates through all DeliveryLocations tag (Only 1 in this example.)
        {
            Element section = (Element) locations.item(i);
            System.out.println("Node : " + section.getNodeName()); ///Prints out the Location Tag.
            Node textNode = section.getFirstChild(); ////First Child is the Text Node which is empty (not NULL)
            if (textNode.getNodeValue() != null && textNode.getNodeValue().trim().length() > 0) ////QUESTION 1 : Why is the Text Node empty and not null?????
            {
                System.out.println(" Text Node : " + textNode.getNodeValue()); ///If there was text for the <DeliveryLocations> tag it would appear here.
            }
            else
                System.out.println(" Text Node : <null> or <empty>"); ///This is what is printed
            ////Going into the LOOP as a Text Node.
            while (textNode != null && textNode.getNodeType() != Node.ELEMENT_NODE)
            {
                System.out.println (" Before getNextSibling() NodeType : " + textNode.getNodeType()); ////Text Node
                textNode = textNode.getNextSibling(); /////QUESTION 2 : Why did getNextSibling() work shouldn't it require getNextChild()?????
                System.out.println (" After getNextSibling() NodeType : " + textNode.getNodeType()); ///Element Node <Location>
                System.out.println(" Text Node : " + textNode.getNodeName());
            }
            if (textNode != null)
            {
                System.out.println("Data : " + textNode.getFirstChild().getNodeValue()); /////The firstChild is the Text Node and that value
                                                                                         ////is the actual data "North East"
                System.out.println("Confirm which Node : " + textNode.getNodeName()); ///This confirms still on Location Node.
                System.out.println("2nd print : " + textNode.getNextSibling().getNodeName()); ///QUESTION 3 : Why does this get a Text Node???
                                                                                              ///Wouldn't a getChild() get the Text node
                                                                                              ///If the Node is Location then shouldn't its Sibling be Item1??
            }
        }
    }
}
The questions above all concern places where I expected a sibling call rather than a child call, or vice versa.
Are you able to help clarify the confusion?
QUESTION 1 : Why is the Text Node empty and not null?????
The text node is not empty; it is blank, i.e. it contains the whitespace characters between the <DeliveryLocations> and <Location> tags, which are a line terminator (\r\n pair?) and 12 space characters.
Your code blindly assumes there is always a text node there. There might not be, so you should always check the node type.
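As a rough sketch of that check (not your tutorial's code), the snippet below parses a cut-down, inline copy of your data and classifies whatever getFirstChild() returns. The class name FirstChildDemo, the helper names and the 4-space indentation in the inline XML are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class FirstChildDemo {
    // Hypothetical inline copy of the <DeliveryLocations> part of the data (assumed indentation).
    static final String XML =
          "<DeliveryLocations>\n"
        + "    <Location>North East </Location>\n"
        + "    <Item1>Full</Item1>\n"
        + "    <Item2>Empty</Item2>\n"
        + "</DeliveryLocations>";

    // Parses an XML string with the default, non-validating JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Checks the node type before touching the value, instead of assuming a text node.
    static String describe(Node node) {
        if (node == null) {
            return "no child at all";
        }
        if (node.getNodeType() != Node.TEXT_NODE) {
            return "not a text node: " + node.getNodeName();
        }
        String value = node.getNodeValue();
        return value.trim().isEmpty()
                ? "blank text node, length " + value.length()
                : "text: " + value;
    }

    public static void main(String[] args) {
        Node first = parse(XML).getDocumentElement().getFirstChild();
        System.out.println(describe(first)); // blank text node, length 5
    }
}
```

If the same elements are written without any whitespace between the tags, the first child is the <Location> element itself, which is why the type check matters.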
QUESTION 2 : Why did getNextSibling() work shouldn't it require getNextChild()?????
There is no getNextChild() method. Remember, you're calling getFirstChild() on the parent node (element), so from that node's perspective, the call returns a child. You then call getNextSibling() on a child node, so from that node's perspective, the call returns a sibling, i.e. another child of the same parent. The method names are consistent with the node they are called on:
getParentNode() - Walk up to the parent in the hierarchy
getFirstChild() / getLastChild() - Walk down to the child in the hierarchy
getNextSibling() / getPreviousSibling() - Walk sideways between nodes with the same parent
QUESTION 3 : Why does this get a Text Node???
Because there is whitespace between the </Location> and <Item1> tags.
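If what you actually want is the next element, one common approach is to keep calling getNextSibling() until you reach an ELEMENT_NODE. The sketch below assumes a hypothetical helper named nextElementSibling() and an inline copy of your data with assumed indentation; neither comes from your tutorial.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class SiblingDemo {
    // Hypothetical inline copy of the <DeliveryLocations> part of the data (assumed indentation).
    static final String XML =
          "<DeliveryLocations>\n"
        + "    <Location>North East </Location>\n"
        + "    <Item1>Full</Item1>\n"
        + "    <Item2>Empty</Item2>\n"
        + "</DeliveryLocations>";

    // Parses an XML string with the default, non-validating JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: skips text (and any other non-element) siblings.
    // Returns null when no element follows the given node.
    static Element nextElementSibling(Node node) {
        Node next = node.getNextSibling();
        while (next != null && next.getNodeType() != Node.ELEMENT_NODE) {
            next = next.getNextSibling();
        }
        return (Element) next;
    }

    public static void main(String[] args) {
        Node location = parse(XML).getElementsByTagName("Location").item(0);
        System.out.println(location.getNextSibling().getNodeName());    // #text
        System.out.println(nextElementSibling(location).getNodeName()); // Item1
    }
}
```

Note the null check on the result before using it in real code: the last element (<Item2> here) is followed only by whitespace, so the helper returns null for it.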
The nodes of the DOM-tree for <DeliveryLocations> are:
┌───────────────────┐                    ┌───────────────┐
│      ELEMENT      │ →→→ firstChild →→→ │   TEXT NODE   │
│ DeliveryLocations │                    │ <whitespaces> │
└───────────────────┘                    └───────────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                           ┌───────────┐                              ┌───────────────┐
     ↓         ↑←←← parentNode ←←←←←←←←←←← │  ELEMENT  │ →→→ firstChild/lastChild →→→ │   TEXT NODE   │
     ↓         ↑                           │ Location  │ ←←←←←←←← parentNode ←←←←←←←← │ "North East " │
     ↓         ↑                           └───────────┘                              └───────────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                         ┌───────────────┐
     ↓         ↑←←← parentNode ←←←←←←←←← │   TEXT NODE   │
     ↓         ↑                         │ <whitespaces> │
     ↓         ↑                         └───────────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                            ┌─────────┐                              ┌───────────┐
     ↓         ↑←←← parentNode ←←←←←←←←←←←← │ ELEMENT │ →→→ firstChild/lastChild →→→ │ TEXT NODE │
     ↓         ↑                            │  Item1  │ ←←←←←←←← parentNode ←←←←←←←← │  "Full"   │
     ↓         ↑                            └─────────┘                              └───────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                         ┌───────────────┐
     ↓         ↑←←← parentNode ←←←←←←←←← │   TEXT NODE   │
     ↓         ↑                         │ <whitespaces> │
     ↓         ↑                         └───────────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                            ┌─────────┐                              ┌───────────┐
     ↓         ↑←←← parentNode ←←←←←←←←←←←← │ ELEMENT │ →→→ firstChild/lastChild →→→ │ TEXT NODE │
     ↓         ↑                            │  Item2  │ ←←←←←←←← parentNode ←←←←←←←← │  "Empty"  │
     ↓         ↑                            └─────────┘                              └───────────┘
     ↓         ↑                    nextSibling ↓ ↑ previousSibling
     ↓         ↑                         ┌───────────────┐
     ↓         ↑←←← parentNode ←←←←←←←←← │   TEXT NODE   │
     ↓                                   │ <whitespaces> │
     →→→→ lastChild →→→→→→→→→→→→→→→→→→→→ └───────────────┘
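
To see the chain from the diagram for yourself, you can walk it with getFirstChild() followed by repeated getNextSibling() calls and print each node's name. This is only a sketch: the class name ChildWalkDemo, the childNames() helper and the indentation of the inline XML are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ChildWalkDemo {
    // Hypothetical inline copy of the <DeliveryLocations> part of the data (assumed indentation).
    static final String XML =
          "<DeliveryLocations>\n"
        + "    <Location>North East </Location>\n"
        + "    <Item1>Full</Item1>\n"
        + "    <Item2>Empty</Item2>\n"
        + "</DeliveryLocations>";

    // Parses an XML string with the default, non-validating JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Walks down once (getFirstChild), then sideways (getNextSibling) until the chain ends.
    static List<String> childNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            names.add(child.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Element section = parse(XML).getDocumentElement();
        System.out.println(childNames(section));
        // [#text, Location, #text, Item1, #text, Item2, #text]
        System.out.println(section.getLastChild().getNodeName());             // #text
        System.out.println(section.getFirstChild().getParentNode() == section); // true
    }
}
```

Every other entry is a whitespace-only text node, which is exactly what the diagram shows between the elements.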