php html tinymce htmlpurifier

Changing tags with PHP depending on content length


I am writing an application that gives users a TinyMCE HTML editor. The problem I am facing is that, no matter how often I ask my users to use the "Heading 2" (h2) style for their headers, they either use h1 (which I can deal with!) or start a new paragraph and bold its entire content.

For example:

<p><strong>This is a header</strong></p>
<p>Content content blah blah blah.</p>

What I would like to do is find all instances of <p><strong> that contain, say, fewer than eight words and replace them with an h2.

What is the best way to do this?

UPDATE: Thanks to Jack's code, I have worked on a simple module that does everything that I described here and more. The code is here on GitHub.


Solution

  • You can use DOMDocument for this. Find each <strong> tag that is a child of a <p>, check that it makes up the paragraph's entire text, count its words, and replace the paragraph with an <h2>:

    $content = <<<'EOM'
    <p><strong>This is a header</strong></p>
    <p>Content content blah blah blah.</p>
    EOM;
    
    $doc = new DOMDocument;
    $doc->loadHTML($content);
    $xp = new DOMXPath($doc);
    
    
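    foreach ($xp->query('//p/strong') as $node) {
        $parent = $node->parentNode;
        // Convert only when the bold text is the paragraph's entire content
        // and it is short enough to be a heading.
        if ($parent->textContent == $node->textContent &&
                str_word_count($node->textContent) <= 8) {
            // Append a text node rather than passing the text to createElement(),
            // which does not escape a bare "&" in the value.
            $header = $doc->createElement('h2');
            $header->appendChild($doc->createTextNode($node->textContent));
            $parent->parentNode->replaceChild($header, $parent);
        }
    }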
    
    echo $doc->saveHTML();
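
Since TinyMCE usually submits an HTML fragment rather than a full page, it may help to wrap the approach above in a function that returns only the fragment. The following is a minimal sketch, not part of the original answer: the function name `promoteBoldParagraphs` and the `$maxWords` parameter are hypothetical, and it assumes the input is UTF-8. It counts whitespace-separated words instead of using `str_word_count()`, which does not treat digits as part of a word.

```php
<?php
// Hypothetical helper (an illustration, not the original answer's code):
// promote short, fully bold paragraphs in an HTML fragment to <h2>.
function promoteBoldParagraphs(string $html, int $maxWords = 8): string
{
    $doc = new DOMDocument();

    // User-submitted markup is rarely perfect, so collect parser
    // warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The meta tag tells the parser to read the input as UTF-8 (assumption).
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<body>' . $html . '</body>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);
    foreach ($xp->query('//p/strong') as $strong) {
        $p = $strong->parentNode;
        $text = trim($strong->textContent);

        // Skip empty bold runs and paragraphs with text outside <strong>.
        if ($text === '' || trim($p->textContent) !== $text) {
            continue;
        }

        // Count whitespace-separated words and leave long paragraphs alone.
        $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
        if (count($words) > $maxWords) {
            continue;
        }

        // A text node is escaped on output, so "&" in a heading is safe.
        $h2 = $doc->createElement('h2');
        $h2->appendChild($doc->createTextNode($text));
        $p->parentNode->replaceChild($h2, $p);
    }

    // Serialize only the children of <body>, dropping the html/body
    // wrapper that loadHTML() adds around the fragment.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling `echo promoteBoldParagraphs($content);` with the `$content` string from the example above should print the fragment with the first paragraph turned into an `<h2>`. The serializer may insert line breaks between block elements, so compare output loosely rather than byte for byte.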