I am writing an application that gives users a TinyMCE HTML editor. The problem I am facing is that, no matter how often I ask my users to use the "Heading 2" (h2) style for their headers, they either use h1 (which I can deal with!) or they start a new paragraph and bold its entire content.
For example:
<p><strong>This is a header</strong></p>
<p>Content content blah blah blah.</p>
What I would like to do is find all of the instances of <p><strong> that have, say, fewer than eight words in them and replace them with an h2.
What is the best way to do this?
UPDATE: Thanks to Jack's code, I have worked on a simple module that does everything that I described here and more. The code is here on GitHub.
You can use DOMDocument for this. Find the <strong> tag that's a child of <p>, count the number of words and replace node and parent with a <h2>:
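$content = <<<'EOM'
<p><strong>This is a header</strong></p>
<p>Content content blah blah blah.</p>
EOM;

$doc = new DOMDocument;
$doc->loadHTML($content);
$xp = new DOMXPath($doc);

// Find every <strong> that is a direct child of a <p>.
foreach ($xp->query('//p/strong') as $node) {
    $parent = $node->parentNode;
    // Convert only when the <strong> holds the paragraph's entire text
    // and that text is eight words or fewer.
    if ($parent->textContent == $node->textContent &&
        str_word_count($node->textContent) <= 8) {
        // Add the text as a text node so characters such as & are escaped;
        // the value argument of createElement() is not escaped.
        $header = $doc->createElement('h2');
        $header->appendChild($doc->createTextNode($node->textContent));
        $parent->parentNode->replaceChild($header, $parent);
    }
}

echo $doc->saveHTML();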
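Note that saveHTML() called without an argument serializes the whole document, including the doctype and the html/body wrappers that loadHTML() adds. If you need only the edited fragment back (for example, to store what came out of the editor), something like the following might work. This is a rough sketch, not the code from the module mentioned in the question: the function name promoteBoldParagraphs and the $maxWords parameter are made up for illustration, and it assumes plain ASCII input like the sample above.

```php
<?php
// Hypothetical helper, not part of the original answer: promotes short,
// fully bold paragraphs to <h2> and returns an HTML fragment.
function promoteBoldParagraphs(string $html, int $maxWords = 8): string
{
    // Nothing to parse for empty input.
    if (trim($html) === '') {
        return '';
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them,
    // since user-entered markup is often imperfect.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);
    foreach ($xp->query('//p/strong') as $node) {
        $parent = $node->parentNode;
        $text = trim($node->textContent);
        $words = str_word_count($text);
        // Convert only when the bold text is the whole paragraph
        // and has between 1 and $maxWords words.
        if (trim($parent->textContent) === $text && $words > 0 && $words <= $maxWords) {
            $header = $doc->createElement('h2');
            $header->appendChild($doc->createTextNode($text));
            $parent->parentNode->replaceChild($header, $parent);
        }
    }

    // Serialize only the children of <body>, leaving out the doctype
    // and the html/body wrappers.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}
```

Calling echo promoteBoldParagraphs($content); on the sample input should give back the header as <h2>This is a header</h2> with the content paragraph untouched. Paragraphs that only start with bold text, or whose bold text is longer than the limit, are left alone.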