I'm writing a script that loads an XML file and displays some of the text in it. A sample XML structure could look like this:
<documento fecha_actualizacion="20221027071750">
<metadatos>
[...]
</metadatos>
<analisis>
[...]
</analisis>
<texto>
<dl>
<dt>1. Poder adjudicador: </dt>
<dd>
[...]
</dd>
</dl>
</texto>
</documento>
I'm trying to get the HTML inside the 'texto' element as a string ('<dl><dt>1. Poder ad[...]</dt></dd>[...]'), but when I fetch it, it is shown as:
Array ( [0] => SimpleXMLElement Object ( [dl] => SimpleXMLElement Object ( [dt] => Array ( [0] => 1. Poder adjudicador: [1] => 2. Tip
grouped by element (dl, dt, dd, etc.). I've tried every possible way I could find of querying that 'texto' element (with '//texto/text()', innerhtml, node(), nodeValue(), etc.), but it always returns the same result.
How can I get something like '<dl><dt>1. Poder ad[...]</dt></dd>[...]'?
Thank you!!
I have tried these XPath expressions:
$texto = $xml->xpath('//texto/text()');
$texto = $xml->xpath('//texto/innerXml()');
$texto = $xml->xpath('//texto/node()');
$texto = $xml->xpath('//texto/nodevalue()');
You need to fetch the parent nodes (texto), iterate over their children, and save each child node as XML:
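// getXMLString() stands for whatever returns the XML document as a string
$documento = new SimpleXMLElement(getXMLString());
// xpath() returns an array of SimpleXMLElement objects, one per texto element
foreach ($documento->xpath('//texto') as $texto) {
    $result = '';
    // children() yields the element children; asXML() serializes each one
    foreach ($texto->children() as $content) {
        $result .= $content->asXML();
    }
    var_dump($result);
}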
Output:
string(59) "<dl>
<dt>1. Poder adjudicador: </dt>
<dd>
[...]
</dd>
</dl>"
SimpleXML is an abstraction focused on element nodes, and it has limits. If the texto element can have non-element child nodes (text, comments, CDATA sections), children() will not return them, so they will be missing from the result. In that case you need to use DOM:
$document = new DOMDocument();
$document->loadXML(getXMLString());
$xpath = new DOMXPath($document);
foreach ($xpath->evaluate('//texto') as $texto) {
    $result = '';
    // childNodes includes every node type, not only elements
    foreach ($texto->childNodes as $content) {
        // saveXML() with a node argument serializes just that node
        $result .= $document->saveXML($content);
    }
    var_dump($result);
}
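To illustrate the difference between the two approaches, here is a minimal sketch using a made-up document in which texto contains text mixed with an element. The sample XML and the variable names $simple and $dom are assumptions for the demonstration only, not part of the original data.

```php
<?php
// Hypothetical sample: <texto> holds text nodes around a single element.
$xml = '<documento><texto>Intro <b>bold</b> tail</texto></documento>';

// SimpleXML: children() yields element nodes only, so the text is dropped.
$texto = (new SimpleXMLElement($xml))->texto;
$simple = '';
foreach ($texto->children() as $child) {
    $simple .= $child->asXML();
}

// DOM: childNodes contains the text nodes as well as the element.
$document = new DOMDocument();
$document->loadXML($xml);
$dom = '';
foreach ($document->getElementsByTagName('texto')->item(0)->childNodes as $node) {
    $dom .= $document->saveXML($node);
}

echo $simple, "\n"; // <b>bold</b>
echo $dom, "\n";    // Intro <b>bold</b> tail
```

If texto only ever contains elements, as in the sample from the question, both loops should produce the same string.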
Additionally, DOMXPath::evaluate() supports full XPath 1.0, including expressions that return scalar values.
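As a small sketch of what scalar results look like, the snippet below runs count() and string() expressions against a shortened, made-up version of the document. The dd contents and the second dt are placeholders I invented for the example.

```php
<?php
// Hypothetical, shortened document; the "A", "B" and "2. Second item" values are placeholders.
$xml = '<documento><texto><dl>'
     . '<dt>1. Poder adjudicador: </dt><dd>A</dd>'
     . '<dt>2. Second item</dt><dd>B</dd>'
     . '</dl></texto></documento>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// count() returns a number, which PHP receives as a float.
$count = $xpath->evaluate('count(//texto/dl/dt)');

// string() returns the text content of the first matching node.
$first = $xpath->evaluate('string(//texto/dl/dt[1])');

var_dump($count); // float(2)
var_dump($first); // string(22) "1. Poder adjudicador: "
```

DOMXPath::query() and SimpleXMLElement::xpath() only handle expressions that yield node lists, so expressions like these need evaluate().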