I'm using DOMDocument and XPath.
Given the following XML:
<Description>
<CompleteText>
<DetailTxt>
<Text>
<span>Here there is some text</span>
<h2>And maybe a headline</h2>
<br/>
<span>Normal position</span>
<br/>
<span> </span>
<br/>
</Text>
</DetailTxt>
</CompleteText>
</Description>
The node /Description/CompleteText/DetailTxt/Text contains markup that is unfortunately unescaped, and I can't change that. Is there any way to query that content while preserving the HTML markup?
I have tried nodeValue and textContent, but both return the content with the markup stripped.
You can use the saveHTML method of DOMDocument to serialize a node as HTML (passing a node to saveHTML requires PHP 5.3.6 or later). In your case you seem to want to call it on each child node of the selected node and concatenate the resulting strings; in the browser DOM APIs that property is called innerHTML. So I have written a function of that name that does exactly that, and the following snippet also uses DOMXPath's ability to call PHP functions from an XPath expression:
<?php
$xml = <<<'EOD'
<Description>
<CompleteText>
<DetailTxt>
<Text>
<span>Here there is some text</span>
<h2>And maybe a headline</h2>
<br/>
<span>Normal position</span>
<br/>
<span> </span>
<br/>
</Text>
</DetailTxt>
</CompleteText>
</Description>
EOD;
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
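// XPath passes a node-set to a PHP function as an array of DOMNode objects.
function innerHTML($nodeList) {
    // Guard against an XPath expression that matched nothing.
    if (empty($nodeList)) {
        return '';
    }
    $node = $nodeList[0];
    $html = '';
    $containingDoc = $node->ownerDocument;
    // Serialize each child (elements and text nodes) and concatenate.
    foreach ($node->childNodes as $child) {
        $html .= $containingDoc->saveHTML($child);
    }
    return $html;
}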
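// The php: prefix must be bound to this namespace before php:function() can be used.
$xpath->registerNamespace("php", "http://php.net/xpath");
// Expose only the innerHTML function to XPath expressions.
$xpath->registerPHPFunctions("innerHTML");
$innerHTML = $xpath->evaluate('php:function("innerHTML", /Description/CompleteText/DetailTxt/Text)');
echo $innerHTML;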
The output (see http://sandbox.onlinephpfunctions.com/code/62a980e2d2a2485c2648e16fc647a6bd6ff5620b) is:
<span>Here there is some text</span>
<h2>And maybe a headline</h2>
<br>
<span>Normal position</span>
<br>
<span> </span>
<br>
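
If you don't need to call the function from inside the XPath expression, the same idea can be sketched without registerPHPFunctions: select the node with query() and serialize its children directly. The helper name innerMarkup and its $asXml flag are my own assumptions for illustration, not part of DOMDocument. Passing true switches to saveXML, which keeps self-closing tags such as <br/> instead of emitting <br>.

```php
<?php
// Hypothetical helper (the name and the $asXml flag are assumptions):
// concatenates the serialized child nodes of a single node.
function innerMarkup(DOMNode $node, bool $asXml = false): string
{
    $doc = $node->ownerDocument;
    $out = '';
    foreach ($node->childNodes as $child) {
        // saveXML keeps <br/> self-closing; saveHTML writes it as <br>.
        $out .= $asXml ? $doc->saveXML($child) : $doc->saveHTML($child);
    }
    return $out;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<Description><CompleteText><DetailTxt><Text>'
    . '<span>Here there is some text</span><br/>'
    . '</Text></DetailTxt></CompleteText></Description>'
);
$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList; item(0) is null when nothing matched.
$text = $xpath->query('/Description/CompleteText/DetailTxt/Text')->item(0);
if ($text !== null) {
    echo innerMarkup($text), "\n";       // HTML serialization
    echo innerMarkup($text, true), "\n"; // XML serialization
}
```

Which serialization to pick depends on where the string ends up: saveHTML suits output that goes into an HTML page, while saveXML is the safer choice if the result is fed back into an XML parser.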