This is something I have been wrestling with for a couple of hours now, in various non-working forms.
I have a simple XML file which I import through File > Import XML:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<page>
<p id="1">
hello <test /> alien world <br /> from space
</p>
<p id="2">
hello <test /> another alien world <br /> from space
</p>
</page>
</doc>
For now my goal is rather simple: iterate over all elements. This works until I reach the point where I want to iterate over the (mixed) content within my p tags.
// check version
if (parseInt(app.version) > 4 && app.documents.length > 0) {
    main();
}

function main() {
    // get active document
    var doc = app.activeDocument;
    // get xml that has been imported
    var xmlDocument = doc.xmlElements[0];
    // get page elements
    const page_elements = xmlDocument.evaluateXPathExpression('/doc/page');
    // iterate over page elements
    for (var x = 0; x < page_elements.length; x++) {
        // within each page element look for p elements
        const p_elements = page_elements[x].evaluateXPathExpression('./p');
        for (var y = 0; y < p_elements.length; y++) {
            var child_elements = p_elements[y].evaluateXPathExpression('./node()');
            for (var z = 0; z < child_elements.length; z++) {
                $.writeln(child_elements[z].markupTag.name);
                $.writeln(child_elements[z].contents);
            }
        }
    }
}
The problem is that it doesn't seem to preserve the order of the child items.
The output, according to the ExtendScript Toolkit, is "p hello alien world from space test br", while it should be "hello test alien world br from space".
I'm basing myself on the documentation that is available here (http://jongware.mit.edu/idcs4js/index_XML%20Suite.html), including the use of XmlElements or XmlItems, but to no avail :-(
A possible alternative that I'm now thinking of is to read the XML file through JavaScript, create an XML object from that string, and go on from there.
Your output is correct, it is just not what you expected.
The contents of the p tag is all of the text inside that tag, so you get:
1 tag name -> p
2 tag contents -> hello alien world from space
3 name of the next tag in the collection -> test
4 name of the next tag in the collection -> br
Neither the test nor the br node has any contents; they are empty nodes.
You can try to insert <test>some text</test> in place of the empty <test /> tag
to see how it works. The result should be
p
hello alien world from space
test
some text
br