PHP: strip all HTML tags except some, and remove inline styling from the tags that remain

The HTML looks like this:

$html = 'SOME TEXT<p style="border-top: 0.0px;border-right: 0.0px;vertical-align: baseline;border-bottom: 0.0px;color: #000000;padding-bottom: 0.0px;padding-top: 0.0px;padding-left: 0.0px;margin: 0.0px;border-left: 0.0px;padding-right: 0.0px;background-color: #ffffff;">SOME TEXT';

I tried strip_tags($html, '<p>'); to remove everything except for <p>, but that preserves all the style attributes of the tag.

I want the above to be replaced with just <p>.

What's the best approach?

Thanks!

Solution

The simplest solution for this would be something based on preg_replace().

$html = 'SOME TEXT<p style="border-top: 0.0px;border-right: 0.0px;vertical-align: baseline;border-bottom: 0.0px;color: #000000;padding-bottom: 0.0px;padding-top: 0.0px;padding-left: 0.0px;margin: 0.0px;border-left: 0.0px;padding-right: 0.0px;background-color: #ffffff;">SOME TEXT';
$html = strip_tags($html, '<p>');
$html = preg_replace('/\sstyle=["\'][A-Za-z0-9-:\s.;#]{1,}["\']/', '', $html);

As always, you should be careful when trying to parse HTML with regex. For instance, this would go wrong if the text inside the <p> tag contained something formatted like a style attribute: if I typed style="color:red" between the tags, it would be removed as well. The character class also only covers letters, digits, whitespace and - : . ; #, so a style value containing anything else (a comma, parentheses, %) would be left in place.
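One way to reduce that risk is to match a style attribute only when it sits inside an opening tag. The following is a minimal sketch, not a full HTML parser: the function name strip_style_attributes is made up for illustration, and it assumes attribute values are quoted and do not contain a ">" character.

```php
<?php
// Hypothetical helper (not a built-in): removes style="..." only inside opening tags.
function strip_style_attributes(string $html): string
{
    // Group 1 captures the tag name plus any attributes that precede "style";
    // the quoted style attribute itself is matched but not kept.
    $pattern = '/(<[a-zA-Z][a-zA-Z0-9]*\b[^>]*?)\s+style\s*=\s*(?:"[^"]*"|\'[^\']*\')/i';
    return preg_replace($pattern, '$1', $html);
}

$html = 'Type style="color:red" here<p style="color: #000000;">SOME TEXT</p>';
echo strip_style_attributes($html);
// prints: Type style="color:red" here<p>SOME TEXT</p>
```

The style-like text outside the tag survives because the pattern has to start at a "<" followed by a tag name. Any quoted value is accepted, so the attribute is removed regardless of which characters it contains.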

The next step to make something like this better would be to actually parse the string as a document using the DOMDocument class. Whether that is worth it depends on how robust you need this to be. However, this method could change your string in unexpected ways; for instance, parsing your string with DOMDocument and saving it again would cause markup you did not write (such as a DOCTYPE declaration and <html>/<body> wrappers) to be added, and unclosed tags to be closed. That kind of validation may or may not be useful for you.
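A rough sketch of the DOMDocument approach follows. The helper name remove_styles_with_dom and the wrapper id dom-wrapper are invented for this example; wrapping the fragment in a known <div> is just one way to get the fragment back out without the extra document-level markup.

```php
<?php
// Hypothetical helper (not a built-in): drops every style attribute via the DOM.
function remove_styles_with_dom(string $html): string
{
    // Collect parser warnings internally instead of emitting them for sloppy markup.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Assumption: wrap the fragment in a container so it can be located again later.
    $doc->loadHTML('<div id="dom-wrapper">' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Remove the style attribute from every element that has one.
    foreach ($xpath->query('//*[@style]') as $element) {
        $element->removeAttribute('style');
    }

    // Serialize only the wrapper's children, skipping the doctype/html/body
    // that the parser added around the fragment.
    $wrapper = $xpath->query('//div[@id="dom-wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = 'SOME TEXT<p style="color: #000000;margin: 0.0px;">SOME TEXT';
echo remove_styles_with_dom(strip_tags($html, '<p>'));
```

Note that the parser closes the unclosed <p> on output, which is an instance of the string changing in ways you may not expect. loadHTML() also does not assume UTF-8 by default, so input with non-ASCII characters would likely need an explicit encoding hint.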