php html url urlencode html-entities

URL/HTML escaping/encoding


I have always been confused by URL/HTML encoding/escaping. I am using PHP, so I want to clear some things up.

Can I say that I should always use

  • urlencode: for individual query string parts

      $url = 'http://test.com?param1=' . urlencode('some data') . '&param2=' . urlencode('something else');
    
  • htmlentities: for escaping special characters like < and > so that they will be rendered properly by the browser

Are there any other places where I might use each function? I am not good with escaping and am always confused by it.


Solution

  • First off, you shouldn't be using htmlentities() around 99% of the time. Instead, you should use htmlspecialchars() for escaping text for use inside XML and HTML documents.

    htmlentities() is only useful for displaying characters that the character set you're using can't represent (for example, if your pages are in ASCII but you have some non-ASCII characters you would like to display). Instead, just make the whole page UTF-8 (it's not hard), and be done with it.
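To make the difference concrete, here is a minimal sketch comparing the two functions on the same input. The sample string is made up for illustration, and the snippet assumes the script file itself is saved as UTF-8.

```php
<?php
// Hypothetical sample text containing a non-ASCII character and markup.
$text = 'Café <b>';

// Escapes only the characters that are special in HTML: & < > " '
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Additionally converts every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // Café &lt;b&gt;
echo $entities, "\n";  // Caf&eacute; &lt;b&gt;
```

On a UTF-8 page both results render the same in the browser, so the extra entity conversion buys nothing there.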

    As for urlencode(), you hit the nail on the head.

    So, to recap:

    • Inside HTML:

      <b><?php echo htmlspecialchars($string, ENT_QUOTES, "UTF-8"); ?></b>
      
    • Inside of a URL:

      $url = '?foo=' . urlencode('bar');
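
The two cases in the recap often meet when a URL is printed into an HTML attribute. The following is a rough sketch of that situation, not the only way to do it: the variable names and values are hypothetical, and it assumes a UTF-8 page. Each query string value is passed through urlencode() first, and the finished URL is then passed through htmlspecialchars() when it is written into the markup.

```php
<?php
// Hypothetical user-supplied values, for illustration only.
$name  = 'Tom & Jerry';
$query = 'a/b?c=d';

// Step 1: urlencode() each query string value individually.
$url = '?name=' . urlencode($name) . '&q=' . urlencode($query);

// Step 2: htmlspecialchars() the whole URL when it goes into HTML,
// so the "&" separating the parameters becomes "&amp;".
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">Search</a>';

echo $url, "\n";   // ?name=Tom+%26+Jerry&q=a%2Fb%3Fc%3Dd
echo $link, "\n";  // <a href="?name=Tom+%26+Jerry&amp;q=a%2Fb%3Fc%3Dd">Search</a>
```

The order matters: URL-encode the parts while building the URL, and HTML-escape only at the point of output.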