htmlspecialchars - different escaping for attributes compared to everything else?


I have been reading up on htmlspecialchars() for escaping user input and user input from the database. Before anyone says anything, yes, I am filtering on db input as well as using prepared statements with bindings. I am only concerned about securing the output.

I am confused as to when to use ENT_COMPAT, ENT_QUOTES, ENT_NOQUOTES. I came across the following excerpt while doing my research:

The second argument in the htmlspecialchars() call is ENT_COMPAT. I've used that because it's a safe default: it will also escape double-quote characters ". You only really need to do that if you're outputting inside an HTML attribute (like <img src="<?php echo htmlspecialchars($img_path, ENT_COMPAT, 'UTF-8'); ?>">). You could use ENT_NOQUOTES everywhere else.

I have found similar comments elsewhere as well. What is the purpose of converting single and/or double quotes for attributes yet not converting them elsewhere? The only thing I can think of is if you were adding actual html into the page for instance:

My variable is: <img src="somepic.jpg" alt="some text">. If you converted the double quotes here, the tag would not render properly because of the escaped quotes. In the example given in the excerpt, though, I can't think of a case where any type of quote would be used.

Secondly, this particular reference says to use ENT_NOQUOTES everywhere else. Why? My instinct is to use ENT_QUOTES everywhere, and ENT_NOQUOTES if and only if the variable is an actual HTML attribute that requires the quotes.

I've done lots of searching and reading, but I am still confused about all of this. My main goal is to secure output to the page so that no HTML, PHP, or JS manipulation can happen.


Solution

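  • Just use ENT_QUOTES everywhere. It escapes both double and single quotes, whereas ENT_COMPAT escapes only double quotes and ENT_NOQUOTES escapes neither. PHP offers the other flags in case you need them, but you almost never do. Escaping quotes outside an attribute is harmless, because the browser decodes &quot; and &#039; back to the same characters in text content.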

    htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
    

    Because that call is too long to keep writing everywhere, wrap it in a tiny function:

    function es($string) {
      return htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
    }
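
To see why the flag matters in attributes but not elsewhere, here is a minimal sketch comparing the three flags. The variables $input and $evil are hypothetical values made up for illustration, and the &#039; output assumes the default ENT_HTML401 document type.

```php
<?php
// Hypothetical input containing both kinds of quote.
$input = 'a "b" \'c\'';

$noquotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8'); // neither quote escaped
$compat   = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');   // only " escaped
$quotes   = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');   // both " and ' escaped

// Hypothetical hostile value aimed at a single-quoted attribute.
$evil = "x' onmouseover='alert(1)";

// ENT_COMPAT leaves ' alone, so the alt value ends after "x" and the rest
// is parsed as a separate onmouseover attribute.
$unsafe = "<img alt='" . htmlspecialchars($evil, ENT_COMPAT, 'UTF-8') . "'>";

// ENT_QUOTES encodes each ' so the whole string stays inside alt.
$safe = "<img alt='" . htmlspecialchars($evil, ENT_QUOTES, 'UTF-8') . "'>";

echo $noquotes, "\n", $compat, "\n", $quotes, "\n", $unsafe, "\n", $safe, "\n";
```

ENT_COMPAT is only safe if every attribute in your templates is double-quoted. ENT_QUOTES removes that dependency, which is why using it everywhere is the simpler rule.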