
PHP Xpath to remove <br />?


I am using XPath to remove <br /> elements from nodes with the code below:

$nodeList = $xpath->query("//p[node()[1][self::br]]/br[1] | //p[node()[last()][self::br]]/br[last()] | //*[node()[last()][self::br]]/br[last()]");
foreach($nodeList as $node) 
{
   $node->parentNode->removeChild($node);
}

So it turns <p>Text<strong><br /></strong></p> into <p>Text</p>, which is perfect.

But I don't want it to remove the <br /> from <p>Text<strong>Bold<br /></strong>Break</p>, because there is text after the <br />.

How can I fix this?


Solution

  • If the string value of the <br>'s parent element is an empty string, you want to remove the <br>. That probably matches your needs better:

    //br[string(..) = '']
    

    A code example that shows which <br> elements the expression matches (it marks each match with a remove attribute):

    <?php
    
    $xml = simplexml_load_string('
    <root>
       <p>Text<strong><br /></strong></p>
       <p>Text<strong>Bold<br /></strong></p>
    </root>
    ');
    
    foreach($xml->xpath('//br[string(..) = ""]') as $br) {
        $br['remove'] = 'remove';
    }
    
    echo $xml->asXML();
    

    Output:

    <?xml version="1.0"?>
    <root>
       <p>Text<strong><br remove="remove"/></strong></p>
       <p>Text<strong>Bold<br/></strong></p>
    </root>
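
The example above only marks the matches. To actually remove them with DOMXPath, as in the question, the same expression can be plugged into the original loop. The following is a minimal sketch, not the answerer's code: the function name removeEmptyParentBreaks is hypothetical, and it assumes the input is well-formed XML that loadXML() can parse.

```php
<?php

// Hypothetical helper (not part of the original answer): removes every <br>
// whose parent element has an empty string value.
function removeEmptyParentBreaks(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml); // assumes well-formed XML input

    $xpath = new DOMXPath($doc);

    // Same expression as in the answer: string(..) is the parent's string
    // value, i.e. the concatenated text of all its descendants.
    foreach ($xpath->query("//br[string(..) = '']") as $br) {
        $br->parentNode->removeChild($br);
    }

    // Serialize only the root element, without the XML declaration.
    return $doc->saveXML($doc->documentElement);
}

$input = '<root>'
    . '<p>Text<strong><br /></strong></p>'
    . '<p>Text<strong>Bold<br /></strong>Break</p>'
    . '</root>';

echo removeEmptyParentBreaks($input), "\n";
```

With this input, only the <br /> inside the empty <strong> is removed; the one after "Bold" stays because its parent's string value is "Bold". The now-empty <strong> element is left in place, so getting all the way to <p>Text</p> would need a separate cleanup step.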