I have a WYSIWYG editor on a site. The problem is that users are copying and pasting a lot of data into it, leaving a lot of unclosed and improperly formatted div tags that are breaking the site layout.
Is there an easy way to strip all occurrences of <div> and </div>?
str_replace won't work because some of the divs have styling and other attributes in them, so it would need to account for <div style="some styling">, <div align="center">, etc.
I'm guessing this could be done with a regular expression, but I am a total beginner when it comes to those.
No. You do NOT ever parse/manipulate HTML with regexes.
Regexes cannot be bargained with. They can't be reasoned with. They don't understand HTML, they don't grok XML. And they absolutely will NOT stop until your DOM tree is dead.
You use HTML Purifier and/or DOM to manipulate the tree.
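For the DOM route, something along these lines might work as a starting point. It is only a rough sketch using PHP's built-in DOMDocument: the function name stripDivs is made up for illustration, and it assumes the pasted markup is UTF-8 and that you want to keep the contents of each div while dropping the div tags themselves. The parser closes any unclosed divs for you, which is the part a regex can't do.

```php
<?php
// Hypothetical helper (name is illustrative): removes every <div> wrapper
// but keeps whatever was inside it.
function stripDivs(string $html): string
{
    $doc = new DOMDocument();

    // Pasted markup is expected to be malformed, so collect libxml
    // warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment in a known root so only the body content is
    // serialised afterwards. The XML encoding hint is an assumption that
    // the input is UTF-8.
    $doc->loadHTML('<?xml encoding="utf-8"?><html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName returns a live list, so take a snapshot
    // before modifying the tree.
    $divs = iterator_to_array($doc->getElementsByTagName('div'));
    foreach ($divs as $div) {
        // Move each child up to sit just before the div...
        while ($div->firstChild !== null) {
            $div->parentNode->insertBefore($div->firstChild, $div);
        }
        // ...then drop the now-empty div.
        $div->parentNode->removeChild($div);
    }

    // Serialise only the children of <body>, not the wrapper itself.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: attributes and unclosed divs are handled by the parser.
echo stripDivs('<div style="color:red"><p>Hi</p><div align="center">there');
```

If you also need to whitelist tags and attributes in general (not just remove divs), that is where HTML Purifier comes in; the sketch above only deals with the div problem from the question.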