php, xpath, domxpath

XPath - select empty elements that are not part of a list


$list = array('br', 'hr', 'link', 'meta', 'title');

Using DOMXPath, how can I select nodes that are empty and whose tagName is not in $list? (I want to add a space to their textContent so they are not automatically closed.)


Solution

  • You didn't give us any XML to work with, which is not very nice, but here you go:

    $xml = <<<XML
    <div>
       <a>
       </a>
       <p>some text</p>
       <p></p>
       <span>no text
          <hr/>
          <ul></ul>
       </span>
       <br/>
    </div>
    XML;
    
    $dom = new DOMDocument;
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    $list = array('br', 'hr', 'link', 'meta', 'title');
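    // Build "not(self::br) and not(self::hr) and ..." from the tag list
    $expr = array();
    foreach ($list as $l) {
       $expr[] = "not(self::$l)";
    }
    $expr = implode(' and ', $expr);
    
    // Match elements outside the list whose text is empty or whitespace-only
    foreach ($xpath->query("//*[$expr and not(normalize-space())]") as $elem) {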
       echo "$elem->nodeName\n";
    }
    

    This outputs:

    a
    p
    ul
    

    As expected. Now that you have the nodes, adding the space is up to you. IMO it would be easier to query with just not(normalize-space()) and then check in PHP whether the nodeName is in your list, but you asked for an XPath expression, so that's what you got.
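
    For what it's worth, here is a rough sketch of that simpler route: query with only not(normalize-space()), skip the tags in $list with in_array(), and append a single-space text node to the rest. The $padded and $out variables are just names picked for this illustration, and appending a text node is one possible way to add the space, not the only one.

```php
<?php
// Same sample document as above.
$xml = <<<XML
<div>
   <a>
   </a>
   <p>some text</p>
   <p></p>
   <span>no text
      <hr/>
      <ul></ul>
   </span>
   <br/>
</div>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$list = array('br', 'hr', 'link', 'meta', 'title');

$padded = array(); // names of the elements that received a space (illustration only)
foreach ($xpath->query('//*[not(normalize-space())]') as $elem) {
    // Do the tag-name check in PHP instead of in the XPath expression
    if (in_array($elem->nodeName, $list, true)) {
        continue; // leave br, hr, ... untouched
    }
    // Append a text node rather than overwriting nodeValue,
    // so any child elements the node may have are kept
    $elem->appendChild($dom->createTextNode(' '));
    $padded[] = $elem->nodeName;
}

$out = $dom->saveXML($dom->documentElement);
echo implode("\n", $padded), "\n";
echo $out, "\n";
```

    With the sample document, the empty p and ul come out as `<p> </p>` and `<ul> </ul>`, while hr and br stay self-closed.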

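    Note that normalize-space() is used because pure whitespace may still cause the node to automatically close. If that's not an issue, you can use not(node()) instead, which matches only elements that have no child nodes at all (no text, not even whitespace, and no child elements).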
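
    To make the difference between the two predicates concrete, here is a small sketch on a made-up document (not the one from the question). matchingNames() is a hypothetical helper written for this example, and it assumes DOMDocument's default setting of keeping whitespace-only text nodes.

```php
<?php
// Hypothetical helper: node names matched by an XPath query, in document order.
function matchingNames(DOMXPath $xpath, $query)
{
    $names = array();
    foreach ($xpath->query($query) as $node) {
        $names[] = $node->nodeName;
    }
    return $names;
}

$dom = new DOMDocument;
// <a> holds only whitespace, <p> is truly empty, <b> holds only a child element
$dom->loadXML('<div>text<a> </a><p></p><b><i/></b></div>');
$xpath = new DOMXPath($dom);

// Empty string value: no text at all, or whitespace only (child elements allowed)
$byText = matchingNames($xpath, '//*[not(normalize-space())]');

// No child nodes of any kind
$byNode = matchingNames($xpath, '//*[not(node())]');

echo implode(', ', $byText), "\n"; // a, p, b, i
echo implode(', ', $byNode), "\n"; // p, i
```

    So not(node()) skips the whitespace-only a and the b that merely wraps another element, while not(normalize-space()) picks up both.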