I'm trying to use preg_replace()
to sanitize poorly written XML.
$x = '<abc x="y"><def x="g">more test</def x="g"><blah>test data</blah></abc x="y">';
The idea is to check whether a closing tag </ > contains a space
and, if so, delete everything from that space up to the end of the tag.
Desired result:
<abc x="y"><def x="g">more test</def><blah>test data</blah></abc>
This should do it:
$x = preg_replace('/<\/(\w+)\s*[^>]*>/', '</\1>', $x);
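For anyone who wants to try it, here is a minimal, self-contained sketch of the same idea. It is a variation, not the exact pattern above: it assumes tag names may also contain ':', '.' and '-' (for example, namespaced names), and it requires at least one whitespace character after the name so that already-clean closing tags are left alone. The variable name $clean is just for illustration.

```php
<?php
// Malformed input from the question: closing tags repeat the attributes.
$x = '<abc x="y"><def x="g">more test</def x="g"><blah>test data</blah></abc x="y">';

// Assumption: tag names consist of word characters plus ':', '.' and '-'.
//   <\/         literal "</"
//   ([\w:.-]+)  the tag name, captured as group 1
//   \s[^>]*     one whitespace character, then everything up to the next ">"
//   >           the end of the closing tag
// preg_replace() returns the new string; it does not modify $x in place.
$clean = preg_replace('/<\/([\w:.-]+)\s[^>]*>/', '</$1>', $x);

echo $clean, PHP_EOL;
// prints: <abc x="y"><def x="g">more test</def><blah>test data</blah></abc>
```

One caveat: because the pattern stops at the first ">", a stray attribute value that itself contains ">" would not be removed cleanly. For input like the sample above that should not matter.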