Tags: php, xss, code-injection, htmlspecialchars

How do I render htmlspecialchars() output as formatted HTML?


I've read here:
https://stackoverflow.com/a/8932454/4301970
that htmlspecialchars() is very effective at preventing XSS attacks.

I'm receiving formatted text from a WYSIWYG editor, for example:

<p>
    <em>
        <strong><span style="font-size:36pt;">test</span></strong>
    </em>
</p>

Encoding this in my HTML page:

<!DOCTYPE html>
<html lang=en>
<head>
    <title></title>
</head>
<body>
<?php echo htmlspecialchars('<p><em><strong><span style="font-size:36pt;">test</span></strong></em></p>', ENT_QUOTES); ?>
</body>
</html>

The browser then displays the markup as literal text:

<p><em><strong><span style="font-size:36pt;">test</span></strong></em></p>

How can I display the formatted text correctly, while preventing XSS injections?


Solution

  • htmlspecialchars() encodes the characters that have special meaning in HTML: <, >, &, ", and ' (the single quote only when ENT_QUOTES is set, which is the default since PHP 8.1).

    With that encoding, the browser shows any injected markup as plain text instead of interpreting it.

    For example

    <script>alert('bam');</script>
    

    would be

    &lt;script&gt;alert('bam');&lt;/script&gt;
    // or, with ENT_QUOTES set:
    &lt;script&gt;alert(&#039;bam&#039;);&lt;/script&gt;
    

    which the browser displays as text rather than running as JavaScript. So that can be an effective means of stopping XSS injections. However, you want users to be able to submit some HTML, so you will need a whitelist of approved elements. One way to do that is to replace the < and > of each approved tag with custom placeholder text that won't occur in your users' input. In the example below I've chosen custom_random_hack. Then run everything through htmlspecialchars(), which encodes all remaining special characters, and finally convert the placeholders back to < and >. Note that the pattern below matches only bare tags with no attributes, so the <span style="..."> from the question stays encoded and shows up as text.

    $string = '<p>
        <em>
            <strong><span style="font-size:36pt;">test</span></strong>
        </em>
    </p>';
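    $allowedtags = array('p', 'em', 'strong');
    // Match only bare opening/closing tags from the whitelist, e.g. <p> or </em>.
    $pattern = '~<(/?(?:' . implode('|', $allowedtags) . '))>~';
    // Swap whitelisted tags for placeholders so htmlspecialchars() leaves them alone.
    $string = preg_replace($pattern, '#custom_random_hack$1custom_random_hack#', $string);
    // Encode everything else, then turn the placeholders back into < and >.
    echo str_replace(array('#custom_random_hack', 'custom_random_hack#'), array('<', '>'), htmlspecialchars($string, ENT_QUOTES));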
    

    Demo: https://eval.in/582759
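
One weakness of a fixed placeholder is that a user who knows it can type it directly and have it turned into a tag. A minimal sketch of the same swap, encode, restore idea with a random per-call marker might look like the following. allow_basic_tags() is a hypothetical helper name (not a PHP built-in), the marker format is an arbitrary choice, and the code assumes PHP 7.0+ for random_bytes() and type declarations. Like the code above, it only lets through bare tags without attributes.

```php
<?php
// Sketch only: allow a short list of bare tags and encode everything else.
// allow_basic_tags() is a hypothetical helper, not part of PHP.
function allow_basic_tags(string $html, array $allowedTags): string
{
    // Random per-call marker, so a user cannot type the placeholder themselves.
    $marker = bin2hex(random_bytes(16));
    $open   = '[' . $marker . ':';
    $close  = ':' . $marker . ']';

    // Escape each tag name for use inside the regex.
    $names = array_map(function ($tag) {
        return preg_quote($tag, '~');
    }, $allowedTags);

    // Matches only bare tags such as <p> or </em>, never tags with attributes.
    $pattern = '~<(/?(?:' . implode('|', $names) . '))>~i';

    // 1. Swap allowed tags for placeholders.
    $swapped = preg_replace($pattern, $open . '$1' . $close, $html);
    // 2. Encode every remaining special character.
    $encoded = htmlspecialchars($swapped, ENT_QUOTES, 'UTF-8');
    // 3. Turn the placeholders back into real tags.
    return str_replace(array($open, $close), array('<', '>'), $encoded);
}

echo allow_basic_tags("<p><em>hi</em></p><script>alert('x');</script>", array('p', 'em', 'strong'));
// <p><em>hi</em></p>&lt;script&gt;alert(&#039;x&#039;);&lt;/script&gt;
```

Because the marker changes on every call, text such as #custom_random_hackscriptcustom_random_hack# typed by a user is left as plain text. For rich WYSIWYG input that needs attributes like style, a dedicated HTML sanitizer library is likely a better fit than a regex whitelist.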