
Is there any way of stripping all javascript from a string in PHP?


I have the following PHP code:

$mystr = "<script>window.onload = function(){console.log('Hi')}</script>";
$mystr .= "<div onmouseover='alert(\"Hi\")'></div>";

What I want is to strip every kind of JavaScript from $mystr.

I am trying the following code, but it keeps the onmouseover event handler.

$mystr = strip_tags($mystr,'<div>');

I want to remove the onmouseover attribute and any other inline JavaScript too.

I am actually trying to achieve this in WordPress, and as far as I know there is no HTML Purifier in WordPress.


Solution

  • That's how strip_tags works. For example:

    $html = '<foo>hello<bar>world</bar></foo>';
    $fixed = strip_tags($html, '<bar>');
    echo $fixed;
    

    outputs:

    hello<bar>world</bar>
    

    It doesn't understand the DOM, and it doesn't understand JavaScript. For a tag that isn't on the allowed list, it's essentially doing:

    $fixed = str_replace('<script>', '', $html);
    

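    The only "smarts" it has is recognizing that tags can have attributes, so a stripped tag is deleted together with its attributes. Attributes on the tags you allow are left untouched, which is why your onmouseover survives.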

    If you want to remove a tag and all of its contents, use a DOM parser and delete the unwanted nodes (i.e. tags) and their children from the tree entirely.
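
The DOM approach described above could look roughly like the sketch below, using PHP's bundled DOMDocument and DOMXPath classes. The function name strip_javascript is made up for illustration and is not part of PHP or WordPress. The sketch assumes the input is an HTML fragment, and it only covers <script> elements, on* event-handler attributes, and javascript: URLs, so treat it as a starting point rather than a complete sanitizer.

```php
<?php
// Hypothetical helper (name invented for this example): removes <script>
// elements, inline on* handlers, and javascript: URLs from an HTML fragment.
function strip_javascript(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings about sloppy markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment in an explicit <body> so its nodes can be pulled back
    // out, and declare UTF-8 so non-ASCII text is decoded correctly.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
        . '<body>' . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Delete every <script> node; its text content goes with it.
    foreach (iterator_to_array($xpath->query('//script')) as $script) {
        $script->parentNode->removeChild($script);
    }

    // Walk every attribute and drop event handlers and javascript: URLs.
    foreach (iterator_to_array($xpath->query('//@*')) as $attr) {
        $isHandler = stripos($attr->nodeName, 'on') === 0;
        $isJsUrl   = preg_match('/^\s*javascript:/i', $attr->value) === 1;
        if ($isHandler || $isJsUrl) {
            $attr->ownerElement->removeAttributeNode($attr);
        }
    }

    // Serialize only the children of <body>, i.e. the cleaned fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out  = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

// Usage with the markup from the question (attribute quoting corrected).
$mystr  = "<script>window.onload = function(){console.log('Hi')}</script>";
$mystr .= "<div onmouseover='alert(\"Hi\")'></div>";
echo strip_javascript($mystr), PHP_EOL;
```

With the question's input, the script element (including its body) and the onmouseover attribute are both gone, and only the empty div remains. Because this works on the parsed tree rather than on the raw string, it does not depend on how the tags or attributes happened to be written in the source.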