How can specific sequential elements be wrapped in a single container using DOMDocument?


Let's say I've got the following markup:

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Books, Literature and Languages
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>Hello</p>
    </div>
</div>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Business and Consumer Information
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>World</p>
    </div>
</div>

<p>Some text in between</p>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Fine Arts and Music
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>Hello</p>
    </div>
</div>

<h3 class="handorgel__header">
    <button class="handorgel__header__button">
        Genealogy
    </button>
</h3>

<div class="handorgel__content">
    <div class="handorgel__content__inner">
        <p>World</p>
    </div>
</div>

I want to wrap each group of consecutive .handorgel__* elements so that each group is contained within a <div class="handorgel"> container:

<div class="handorgel">

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Books, Literature and Languages
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>Hello</p>
        </div>
    </div>

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Business and Consumer Information
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>World</p>
        </div>
    </div>

</div>

<p>Some text in between</p>

<div class="handorgel">

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Fine Arts and Music
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>Hello</p>
        </div>
    </div>

    <h3 class="handorgel__header">
        <button class="handorgel__header__button">
            Genealogy
        </button>
    </h3>

    <div class="handorgel__content">
        <div class="handorgel__content__inner">
            <p>World</p>
        </div>
    </div>

</div>

There could be any number of elements within each group, and any number of groups on a page. How can I detect these groups and wrap them appropriately? I currently use DOMDocument for a number of things on this project, so if possible I'd like to use it for this purpose as well, unless there's a clearly superior method.


Solution

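  • I managed to get this working myself after a bunch of trial and error. It wasn't quite as difficult as I assumed it would be, because DOMDocument takes care of some of the removal logic itself: appending a node that is already in the document to a new parent moves it there rather than copying it.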

    
    /**
     * Wrap handorgel groups in appropriate containers
     *
     * @param string $content
     *
     * @return string
     */
    function gpl_wrap_handorgel_shortcodes(string $content): string {
        if (! is_admin() && $content) {
            $DOM = new DOMDocument();
    
            // collect libxml errors internally so HTML5 tags don't raise warnings
            libxml_use_internal_errors(true);
    
            // load in content (the "HTML-ENTITIES" target encoding is deprecated as of PHP 8.2)
            $DOM->loadHTML(mb_convert_encoding("<html><body>{$content}</body></html>", "HTML-ENTITIES", "UTF-8"), LIBXML_HTML_NODEFDTD);
    
            // discard the errors collected while parsing
            libxml_clear_errors();
    
            $body = $DOM->getElementsByTagName("body")->item(0);
    
            $handorgels = [];
    
            $prev_class = "";
    
            foreach ($body->childNodes as $element) {
                /**
                 * Only element nodes get checked/modified; text and comment nodes are skipped
                 */
                if ($element->nodeType === XML_ELEMENT_NODE) {
                    $current_class = $element->getAttribute("class");
    
                    /**
                     * Find any handorgel elements
                     */
                    if (preg_match("/handorgel__/", $current_class)) {
                        $group = array_key_last($handorgels);
    
                        /**
                         * If the previous class didn't include `handorgel__`, create a new handorgel object
                         */
                        if (! preg_match("/handorgel__/", $prev_class)) {
                            $handorgels[] = [
                                "container" => $DOM->createElement("div"),
                                "elements"  => [],
                            ];
    
                            /**
                             * Update `$group` to match the new container
                             */
                            $group = array_key_last($handorgels);
                        }
    
                        /**
                         * Append the current element to the group to be moved after all sequential handorgel
                         * elements are located for its group
                         */
                        $handorgels[$group]["elements"][] = $element;
                    }
    
                    /**
                     * Update `$prev_class` to track where handorgel groups should begin and end
                     */
                    $prev_class = $current_class;
                }
            }
    
            /**
             * Construct the grouped handorgels
             */
            if ($handorgels) {
                foreach ($handorgels as $handorgel) {
                    $handorgel["container"]->setAttribute("class", "handorgel");
    
                    foreach ($handorgel["elements"] as $key => $element) {
                        /**
                         * Insert the container in the starting position for the group
                         */
                        if ($key === 0) {
                            $element->parentNode->insertBefore($handorgel["container"], $element);
                        }
    
                        // appendChild() moves the element out of <body> and into the container
                        $handorgel["container"]->appendChild($element);
                    }
                    }
                }
            }
    
            /**
             * Remove unneeded tags (inserted for parsing reasons)
             */
            $content = remove_extra_tags($DOM); // custom function that removes html/body and outputs the DOMDocument as a string
        }
    
        return $content;
    }
    add_filter("the_content", "gpl_wrap_handorgel_shortcodes", 30, 1);
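
The answer calls a custom `remove_extra_tags()` helper without showing it. The following is only a guess at what such a helper might look like, not the author's actual implementation: it serializes each child of `<body>` so the temporary `<html>`/`<body>` wrapper is dropped. The short demo underneath uses made-up markup and also illustrates the "removal logic" mentioned above, namely that `appendChild()` moves a node that is already attached to the document.

```php
<?php
/**
 * Hypothetical stand-in for the answer's remove_extra_tags() helper (an assumption):
 * returns the serialized children of <body>, without the html/body wrapper.
 */
function remove_extra_tags(DOMDocument $dom): string {
    $body = $dom->getElementsByTagName("body")->item(0);

    if ($body === null) {
        return "";
    }

    $html = "";

    foreach ($body->childNodes as $child) {
        // saveHTML($node) serializes only that node and its descendants
        $html .= $dom->saveHTML($child);
    }

    return $html;
}

// Demo with made-up markup: wrap a single header in a container
$dom = new DOMDocument();
$dom->loadHTML(
    "<html><body><h3 class=\"handorgel__header\">A</h3><p>B</p></body></html>",
    LIBXML_HTML_NODEFDTD
);

$header  = $dom->getElementsByTagName("h3")->item(0);
$wrapper = $dom->createElement("div");
$wrapper->setAttribute("class", "handorgel");

// Place the wrapper where the header currently sits...
$header->parentNode->insertBefore($wrapper, $header);

// ...then append the header to it; it is detached from <body> automatically
$wrapper->appendChild($header);

echo remove_extra_tags($dom);
```

The printed markup should contain the `<h3>` nested inside the new `<div class="handorgel">`, followed by the untouched `<p>`, with no `<html>` or `<body>` tags. Exact whitespace between tags can vary with the libxml version.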