Remove HTML tags and square-bracketed tags from text, but keep PHP tags and their contents


I need help with preg_replace(). See below:

<html>[sourcecode language='php']<?php echo "hello world"; ?>[/sourcecode]</html>

I want to keep only the PHP code, including its opening and closing tags, and strip everything else, so I would get the following result:

<?php echo "hello world"; ?>

I have tried the following:

$update = get_the_content(); 
                                        
$patterns = array();
$patterns[0] = '/<html>/';
$patterns[1] = '/</html>/';
$patterns[2] = '/[sourcecode language]/';
$patterns[3] = '/[/sourcecode]/';
$replacements = array();
$replacements[0] = '';
$replacements[1] = '';
$replacements[2] = '';
$replacements[3] = '';

echo preg_replace($patterns, $replacements, $update);

But it doesn't work. A further problem is that the language will not always be PHP, so the pattern has to handle other language values too.


Solution

  • You need to escape characters that have a special meaning in the pattern: / when / is also the delimiter, and [ and ], which regex otherwise reads as a character class:

    $update = get_the_content();

    $patterns = array();
    $patterns[0] = '/<html>/';
    $patterns[1] = '/<\/html>/';              // \/ escapes the delimiter
    // \[ and \] match literal brackets; [^\]]* skips any attributes,
    // so language='php' or any other language value is matched too
    $patterns[2] = '/\[sourcecode[^\]]*\]/';
    $patterns[3] = '/\[\/sourcecode\]/';
    $replacements = array();
    $replacements[0] = '';
    $replacements[1] = '';
    $replacements[2] = '';
    $replacements[3] = '';

    echo preg_replace($patterns, $replacements, $update);
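A more compact variant of the same idea is sketched below. It is only an illustration, not part of the original answer: the helper name strip_wrapper_tags() and the sample string are made up for the example, and it assumes the only wrappers to remove are the html tags and the sourcecode shortcode. Using # as the delimiter means / no longer needs escaping, and an optional /? lets one pattern cover both the opening and the closing form.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
function strip_wrapper_tags(string $text): string
{
    $patterns = [
        '#</?html>#i',                // opening and closing html tag
        '#\[/?sourcecode[^\]]*\]#i',  // opening shortcode with any attributes, and the closing shortcode
    ];

    // A single string replacement is applied to every pattern in the array.
    // preg_replace() returns null on error, so fall back to the input text.
    return preg_replace($patterns, '', $text) ?? $text;
}

// Sample input mirroring the question; in WordPress this would come from get_the_content().
$sample = "<html>[sourcecode language='php']<?php echo \"hello world\"; ?>[/sourcecode]</html>";

echo strip_wrapper_tags($sample), "\n";
```

With the sample above, only the PHP snippet between the shortcode tags is left. Because the opening pattern accepts anything up to the closing bracket, the same call should also work when the language attribute is something other than php.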