I'm trying to parse a .dita file in which one node is nested inside another. That in itself isn't unusual, but the inner node is surrounded by text. It looks a bit like this:
<node>
Hello this is a <xlink src="example.com">LINK</xlink> that you may click
</node>
I can get the text from node and I can get all instances of xlink, yet the text from the node will look like this:
Hello this is a that you may click
As you can see, the word LINK is missing. Even though I can query the xlink node and get an array containing the word LINK, so far it hasn't been possible to put the word back, because its position is unknown.
I should add that checking for two consecutive spaces wouldn't work: the original text can also contain double spaces, so the position of the words wouldn't be reliable.
The DOMElement::$textContent property contains the text content of all descendant nodes.
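To illustrate why the word keeps its position: the text around the inline element and the element itself are sibling child nodes, stored in document order. Here is a minimal sketch, assuming the sample markup from the question written on a single line; the variable names and the bracket markers are made up for illustration only.

```php
<?php
// Sample markup from the question, on one line to keep the output compact.
$xml = '<node>Hello this is a <xlink src="example.com">LINK</xlink> that you may click</node>';

$document = new DOMDocument();
$document->loadXML($xml);

$parts = [];
// childNodes lists text nodes and element nodes in document order.
foreach ($document->documentElement->childNodes as $child) {
    if ($child instanceof DOMElement) {
        // Wrap element content in brackets so its position is visible (illustrative).
        $parts[] = '[' . $child->textContent . ']';
    } else {
        // Text nodes are copied unchanged.
        $parts[] = $child->textContent;
    }
}

$result = implode('', $parts);
echo $result, "\n"; // Hello this is a [LINK] that you may click
```

If you only need the plain text, none of this is necessary, since textContent already performs the same concatenation. Walking the child nodes is only worthwhile if the inline element needs special treatment, for example being turned into a link.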
If you fetch values via an XPath expression, you can use the string() function to cast the first node into a string, returning its text content.
$xml = <<<'XML'
<node>
Hello this is a <xlink src="example.com">LINK</xlink> that you may click
</node>
XML;
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);
// access the text content of the node element
var_dump($document->documentElement->textContent);
// use Xpath string() function
var_dump($xpath->evaluate('string(self::node)', $document->documentElement));
Output:
string(45) "
Hello this is a LINK that you may click
"
string(45) "
Hello this is a LINK that you may click
"