I have a question about XML parsing. I was experimenting with a sample program and modified it a bit to understand how parsing works. However, I've encountered output I don't quite understand, and I hope some of you can shed light on what may be going on.
This is my XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root xmlns="http://www.test.com">
<ApplicationSettings>
<option_a>"10"</option_a>
<option_b>"24"</option_b>
</ApplicationSettings>
</root>
I inserted debug statements throughout my program to understand what happens when functions such as getChildNodes() are called. This is the output I received:
Parsing xml file...
Processing Root...
Processing children with getChildNodes()...
>>>>>>>>>>> Loop child 0: Node name is: #text
>>>>>>>>>>> Loop child 1: Node name is: ApplicationSettings
= ApplicationSettings processing children with getChildNodes()...
***** iter 0 child name is #text
***** iter 1 child name is option_a
***** iter 2 child name is #text
***** iter 3 child name is option_b
***** iter 4 child name is #text
>>>>>>>>>>> Loop: 2 Node name is: #text
From the output, I can see that it parsed my XML file correctly. However, I noticed the program also detected extra nodes named #text (printed using the getNodeName() function). My question is: what do those #text nodes refer to, and why do they appear periodically throughout the loops?
Thanks!
Those #text nodes in your example refer to the whitespace between tags. For example, here:
<root xmlns="http://www.test.com">
    <ApplicationSettings>
there is a line feed and four spaces between ...com"> and <App..., and the parser reports that whitespace as a #text node.
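To make the whitespace visible, here is a minimal sketch using the standard Java DOM API. The class name WhitespaceNodes and the helper describeChildren are made up for illustration, and the XML string is a shortened copy of your file with the indentation assumed to be four spaces per level.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original program.
class WhitespaceNodes {
    // Shortened copy of the question's file; four-space indentation is an assumption.
    static final String XML =
          "<root xmlns=\"http://www.test.com\">\n"
        + "    <ApplicationSettings>\n"
        + "        <option_a>\"10\"</option_a>\n"
        + "    </ApplicationSettings>\n"
        + "</root>";

    // Parse an XML string into a DOM Document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // List every child of parent; for text nodes, show the raw content
    // with line feeds escaped so the whitespace can be seen.
    static List<String> describeChildren(Node parent) {
        List<String> out = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                out.add("#text[" + child.getNodeValue().replace("\n", "\\n") + "]");
            } else {
                out.add(child.getNodeName());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // Each #text entry holds only the line feed and indentation between tags.
        System.out.println(describeChildren(doc.getDocumentElement()));
    }
}
```

The text nodes contain nothing but the line feed and the spaces used for indentation, which is why getNodeName() reports #text before and after each element.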
You can try to parse the following to see what happens:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<root xmlns="http://www.test.com"><ApplicationSettings><option_a>"10"</option_a><option_b>"24"</option_b></ApplicationSettings></root>
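If you would rather keep the indentation in the file, one common approach is to skip every child that is not an element. The sketch below compares both variants; the class ElementChildren and its helpers are hypothetical names, and the indented string assumes four spaces per nesting level.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original program.
class ElementChildren {
    // The single-line document from the answer: no whitespace between tags.
    static final String COMPACT =
        "<root xmlns=\"http://www.test.com\"><ApplicationSettings>"
      + "<option_a>\"10\"</option_a><option_b>\"24\"</option_b>"
      + "</ApplicationSettings></root>";

    // The question's document; the exact indentation is an assumption.
    static final String INDENTED =
          "<root xmlns=\"http://www.test.com\">\n"
        + "    <ApplicationSettings>\n"
        + "        <option_a>\"10\"</option_a>\n"
        + "        <option_b>\"24\"</option_b>\n"
        + "    </ApplicationSettings>\n"
        + "</root>";

    // Parse the XML string and return its first ApplicationSettings element.
    static Element settings(String xml) {
        try {
            return (Element) DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("ApplicationSettings")
                    .item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Return the names of element children only, ignoring #text nodes.
    static List<String> elementChildNames(Node parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element compact = settings(COMPACT);
        Element indented = settings(INDENTED);
        // Raw child counts differ because of the whitespace text nodes.
        System.out.println("compact raw children:  " + compact.getChildNodes().getLength());
        System.out.println("indented raw children: " + indented.getChildNodes().getLength());
        // After filtering, both documents yield the same element names.
        System.out.println(elementChildNames(compact));
        System.out.println(elementChildNames(indented));
    }
}
```

Checking getNodeType() against Node.ELEMENT_NODE keeps the loop independent of how the file happens to be formatted.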