Tags: php, xml, xpath, domdocument

How to get a text node without spaces at the edges?


How do I get the first text node of a parent element with the leading and trailing whitespace removed?

Node:

<p>Hello <b>World</b> by.</p>

I need the first text node without the whitespace at the edges:

→Hello←

This XPath query:

p/[normalize-space(text()[1])]

returns an error:

DOMXPath::query(): Invalid expression


Solution

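  • DOMXPath::query() can only return node lists; it does not support XPath expressions with scalar results. For those you have to use DOMXPath::evaluate(). Note also that p/[...] is malformed in itself, because a predicate has to follow a node test such as p or text().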

    $xml = <<<'XML'
    <p>Hello <b>World</b> by.</p>
    XML;
    
    $document = new DOMDocument();
    $document->loadXML($xml);
    $xpath = new DOMXpath($document);
    
    var_dump(
        $xpath->evaluate('normalize-space(p/text()[1])', $document)
    );
    

    Output:

    string(5) "Hello"
    

    The string function has to be on the outside, so that you first select the nodes and then cast the result to a string. p selects the p children of the context node, whereas //p selects any p element node in the document. So p/text()[1] selects the first text child node of each p child.

    normalize-space() will cast the first node of the fetched list to a string, strip leading and trailing whitespace, and replace every run of whitespace inside with a single space.
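
To make the points above concrete, here is a small sketch that compares query() with evaluate() and shows how the context node changes what p matches. The sample markup (with extra whitespace added) and the variable names are illustrative assumptions, not part of the original question.

```php
<?php
// Illustrative markup: extra whitespace added to make the effect visible.
$xml = '<p>  Hello   big <b>World</b> by.</p>';

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// query() returns nodes only, so the text still carries its whitespace.
$nodes = $xpath->query('p/text()[1]', $document);
$raw = $nodes->item(0)->nodeValue;          // "  Hello   big "

// evaluate() can return the scalar result of normalize-space():
// edges trimmed, inner runs of whitespace collapsed to a single space.
$normalized = $xpath->evaluate('normalize-space(p/text()[1])', $document);

// trim() in PHP only strips the edges and keeps inner whitespace as is.
$trimmed = trim($raw);

// With the p element itself as context node, the relative step "p" looks
// for p children of p and finds none, so the result is an empty string.
$p = $document->documentElement;
$fromP = $xpath->evaluate('normalize-space(p/text()[1])', $p);

// Dropping the "p" step makes the expression relative to the p element.
$fromPFixed = $xpath->evaluate('normalize-space(text()[1])', $p);

var_dump($raw, $normalized, $trimmed, $fromP, $fromPFixed);
```

If the inner whitespace has to be preserved, trimming the node value in PHP is one option; normalize-space() is the better fit when the collapsed form is what you want.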