php html iframe htmlpurifier

Is it possible to ignore all HTML within a specified tag using HTMLPurifier?


I have a use case where I need to store iframes, and possibly dynamically generated JavaScript, in a database. I am using HTMLPurifier to sanitize the input.

Is it possible in HTMLPurifier to ignore all content (HTML/CSS/JS) inside a specified element, so that it sanitizes everything else but leaves everything inside that element intact?


Solution

  • From Wikipedia:

    CDATA sections in XHTML documents are liable to be parsed differently by web browsers if they render the document as HTML, since HTML parsers do not recognise the CDATA start and end markers, nor do they recognise HTML entity references such as &lt; within <script> tags. This can cause rendering problems in web browsers and can lead to cross-site scripting vulnerabilities if used to display data from untrusted sources, since the two kinds of parser will disagree on where the CDATA section ends.

    Since it is useful to be able to use less-than signs (<) and ampersands (&) in web page scripts, and to a lesser extent styles, without having to remember to escape them, it is common to use CDATA markers around the text of inline <script> and <style> elements in XHTML documents. But so that the document can also be parsed by HTML parsers, which do not recognise the CDATA markers, the CDATA markers are usually commented-out.

    Here is the JavaScript example:

    <script type="text/javascript">
    //<![CDATA[
    document.write("<");
    //]]>
    </script>
    

    Here is the CSS example:

    <style type="text/css">
    /*<![CDATA[*/
    body { background-image: url("marble.png?width=300&height=300") }     
    /*]]>*/
    </style>
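The two examples above can also be generated rather than typed by hand. The following is only a minimal sketch of the commented-out CDATA pattern from the quote: `wrapInline`, `wrapInlineScript` and `wrapInlineStyle` are hypothetical helper names, not part of HTMLPurifier or any other library, and the code does no sanitizing at all. It only adds the markers and refuses text that would close the CDATA section or the element early.

```javascript
// Hypothetical helper: wraps inline script/style text in commented-out
// CDATA markers so that both XML and HTML parsers can read the result.
function wrapInline(tag, type, open, close, text) {
  // "]]>" would end the CDATA section early under an XML parser.
  if (text.includes("]]>")) {
    throw new Error("text must not contain ']]>'");
  }
  // "</script" or "</style" would end the element early under an HTML parser.
  if (text.toLowerCase().includes("</" + tag)) {
    throw new Error("text must not contain '</" + tag + "'");
  }
  return [
    "<" + tag + ' type="' + type + '">',
    open,
    text,
    close,
    "</" + tag + ">",
  ].join("\n");
}

// JavaScript uses a line comment in front of each marker.
function wrapInlineScript(code) {
  return wrapInline("script", "text/javascript", "//<![CDATA[", "//]]>", code);
}

// CSS has no line comments, so each marker sits inside a block comment.
function wrapInlineStyle(css) {
  return wrapInline("style", "text/css", "/*<![CDATA[*/", "/*]]>*/", css);
}

console.log(wrapInlineScript('document.write("<");'));
```

Calling `wrapInlineStyle` with the `body { ... }` rule from the CSS example should produce the same five-line layout with block-comment markers. Whether markup built this way survives a given HTMLPurifier configuration is a separate question that this sketch does not answer.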