Wrap all HTML tags between h3 tag sets with DOMDocument in PHP


I've got a follow-up question to an earlier question of mine, which Jack answered: "Wrap segments of HTML with divs (and generate table of contents from HTML-tags) with PHP".

I've been trying to add some functionality to that answer in order to get the following result.

This is my present HTML:

<h3>Subtitle</h3>
<p>This is a paragraph</p>
<p>This is another paragraph</p>
<h3>Another subtitle</h3>
<p>Yet another paragraph</p>

This is what I would like to achieve:

<h3 class="current">Subtitle</h3>
<div class="ac_pane" style="display:block;">
  <p>This is a paragraph</p>
  <p>This is another paragraph</p>
</div>
<h3>Another subtitle</h3>
<div class="ac_pane">
  <p>Yet another paragraph</p>
</div>

I've been trying to modify the code from that answer, but can't figure it out:

foreach ($d->getElementsByTagName('h3') as $h3) {
    $ac_pane_nodes = array($h3);
    for ($next = $h3->nextSibling; $next && $next->nodeName != 'h3'; $next = $next->nextSibling) {
        $ac_pane_nodes[] = $next;
    }
    $ac_pane = $d->createElement('div');
    $ac_pane->setAttribute('class', 'ac_pane');
    // Here I'm trying to wrap all tags between h3-sets, but am failing!
    $h3->parentNode->appendChild($ac_pane, $h3);
    foreach ($ac_pane_nodes as $node) {
        $ac_pane->appendChild($node);
    }
}

Please note that adding class="current" to the first h3 and style="display:block;" to the first div.ac_pane is optional, but would be very much appreciated.


Solution

  • As requested, here is a working version. IMO XSLT is still the most appropriate solution to this type of problem (transforming some XML into other XML, really), but I have to admit that grouping with regular code is much easier!

    I ended up extending the DOM API slightly just to add a utility insertAfter method on DOMElement. It could have been done without it, but it's neater:

    UPDATED TO WRAP DIV AROUND ALL TAGS AS REQUESTED IN COMMENTS

    <?php
    
    class DOMDocumentExtended extends DOMDocument {
        public function __construct($version = "1.0", $encoding = "UTF-8") {
            parent::__construct($version, $encoding);
            $this->registerNodeClass("DOMElement", "DOMElementExtended");
        }
    }
    
    class DOMElementExtended extends DOMElement {
        public function insertAfter($targetNode) {
            if ($targetNode->nextSibling) {
                $targetNode->parentNode->insertBefore($this, $targetNode->nextSibling);
            } else {
                $targetNode->parentNode->appendChild($this);
            }
        }
    
        public function wrapAround(DOMNodeList $nodeList) {
            while (($node = $nodeList->item(0)) !== NULL) {
                $this->appendChild($node);
            }
        }
    }
    
    $doc = new DOMDocumentExtended();
    $doc->loadHTML(
        "<h3>Subtitle</h3>
        <p>This is a paragraph</p>
        <p>This is another paragraph</p>
            <h3>Another subtitle</h3>
        <p>Yet another paragraph</p>"
    );
    
    // Grab a nodelist of all h3 tags
    $nodeList = $doc->getElementsByTagName("h3");
    
    // Iterate over each of these h3 nodes
    foreach ($nodeList as $index => $h3) {
    
        // Special handling for first h3
        if ($index === 0) {
            $h3->setAttribute("class", "current");
        }
    
        // Create a div node that we'll use as our wrapper
        $div = $doc->createElement("div");
        $div->setAttribute("class", "ac_pane");
    
        // Special handling for first div wrapper
        if ($index === 0) {
            $div->setAttribute("style", "display:block;");
        }
    
        // Move next siblings of h3 until we hit another h3
        while ($h3->nextSibling && $h3->nextSibling->localName !== "h3") {
            $div->appendChild($h3->nextSibling);
        }
    
        // Add the div node right after the h3
        $div->insertAfter($h3);
    }
    
    // UPDATE: wrap all child nodes of body in a div
    $div = $doc->createElement("div");
    $body = $doc->getElementsByTagName("body")->item(0);
    $div->wrapAround($body->childNodes);
    $body->appendChild($div);
    
    echo $doc->saveHTML();
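
For comparison, here is a minimal sketch of the same grouping without subclassing DOMElement. It relies on DOMNode::insertBefore() appending the node when the reference node is null, so no separate insertAfter method is needed. The function name wrapSectionsAfterH3 is hypothetical and not part of the answer above, and the sketch leaves out the final step of wrapping all body children in one outer div.

```php
<?php
// Hypothetical helper (not from the original answer): wraps everything that
// follows each <h3>, up to the next <h3>, in a <div class="ac_pane">.
function wrapSectionsAfterH3(DOMDocument $doc): void
{
    // Copy the live node list into an array before modifying the tree
    $headings = iterator_to_array($doc->getElementsByTagName('h3'), false);

    foreach ($headings as $index => $h3) {
        $div = $doc->createElement('div');
        $div->setAttribute('class', 'ac_pane');

        // Special handling for the first heading and its pane
        if ($index === 0) {
            $h3->setAttribute('class', 'current');
            $div->setAttribute('style', 'display:block;');
        }

        // Move the following siblings into the div until the next h3
        while ($h3->nextSibling && $h3->nextSibling->nodeName !== 'h3') {
            $div->appendChild($h3->nextSibling);
        }

        // With a null reference node, insertBefore() behaves like appendChild()
        $h3->parentNode->insertBefore($div, $h3->nextSibling);
    }
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<h3>Subtitle</h3><p>This is a paragraph</p><p>This is another paragraph</p>'
    . '<h3>Another subtitle</h3><p>Yet another paragraph</p>'
);
wrapSectionsAfterH3($doc);
echo $doc->saveHTML();
```

Whether this or the subclass approach reads better is a matter of taste: the subclass keeps the calling code short, while this version avoids registerNodeClass entirely.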
    

    Note that loadHTML will add doctype, html and body nodes. They can be stripped out if needed.
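
One way to strip those wrapper nodes, sketched here under the assumption that only the markup inside body is wanted, is to serialize each child of body individually. DOMDocument::saveHTML() accepts an optional node argument for this. The function name innerBodyHtml is hypothetical and is not part of the answer above.

```php
<?php
// Hypothetical helper (not from the original answer): returns only the
// markup inside <body>, without the doctype, html and body wrappers.
function innerBodyHtml(DOMDocument $doc): string
{
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $html = '';
    foreach ($body->childNodes as $child) {
        // Passing a node makes saveHTML() serialize just that subtree
        $html .= $doc->saveHTML($child);
    }

    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML('<h3>Subtitle</h3><p>This is a paragraph</p>');
echo innerBodyHtml($doc);
```

The same call can be made on the document produced by the answer's code in place of the final echo $doc->saveHTML().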