
Getting the text portion of a node using PHP SimpleXML


Given the PHP code:

$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

function traverse($xml) {
    $result = "";
    foreach($xml->children() as $x) {
        if ($x->count()) {
            $result .= traverse($x);
        }
        else {
            $result .= $x;
        }
    }
    return $result;
}

$parser = new SimpleXMLElement($xml);
traverse($parser);

I expected the function traverse() to return:

This is a link Title with some text following it.

However, it returns only:

Title

Is there a way to get the expected result using SimpleXML? (The goal is to consume the data, not just return it as in this simplified example.)


Solution

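  • Your traverse() returns only "Title" because children() iterates over child elements only; the text directly inside <article> is not an element, so the loop never sees it. There may be ways to get that text using only SimpleXML, but the simplest approach is to use DOM. The good news is that if you're already using SimpleXML, you don't have to change anything: dom_import_simplexml() gives you the DOM node for a SimpleXML element, so you can use both APIs on the same document: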

    // either
    $articles = simplexml_load_string($xml);
    echo dom_import_simplexml($articles)->textContent;
    
    // or
    $dom = new DOMDocument;
    $dom->loadXML($xml);
    echo $dom->documentElement->textContent;
    

    Assuming your task is to iterate over each <article/> and get its content, your code will look like this:

    $articles = simplexml_load_string($xml);
    foreach ($articles->article as $article)
    {
        // Full text of this <article>, including the text of its child elements
        $articleText = dom_import_simplexml($article)->textContent;
    }
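
Note that textContent keeps the line breaks from the source XML, so it will not match the single-line string expected in the question exactly. If you need to consume the text piece by piece, one option is to walk the DOM child nodes yourself. The following is only a sketch under assumptions: collectText() is a hypothetical helper name, and collapsing whitespace and joining the pieces with a single space is one possible choice, not something the question requires.

```php
<?php
// Sample document from the question.
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

/**
 * Hypothetical helper: collect the text under a DOM node in document
 * order, visiting both text nodes and child elements.
 */
function collectText(DOMNode $node): array
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            // Text sitting directly inside $node, which SimpleXML's
            // children() does not iterate over. Collapse runs of
            // whitespace and drop whitespace-only nodes.
            $text = trim(preg_replace('/\s+/', ' ', $child->nodeValue));
            if ($text !== '') {
                $parts[] = $text;
            }
        } elseif ($child instanceof DOMElement) {
            // Recurse into child elements such as <link>.
            $parts = array_merge($parts, collectText($child));
        }
    }
    return $parts;
}

$articles = simplexml_load_string($xml);
$texts = [];
foreach ($articles->article as $article) {
    // Switch from SimpleXML to DOM for this one element.
    $node = dom_import_simplexml($article);
    $texts[] = implode(' ', collectText($node));
}

echo $texts[0], "\n";
```

With the sample document, this should print the sentence from the question on one line. Because the helper sees each text node and element separately, the same loop can be adapted to treat <link> differently from plain text instead of just concatenating everything.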