
Percent encoding: how is a Greek letter percent-encoded?


I've been searching the web for an answer but cannot find one. I have an HTML form (method=GET) and submit the text helloΩ (hello with the Greek letter Omega appended) in a text field.

The URL in the browser encodes it as:

mytext=hello%26%23937%3B

Without the Greek letter Omega appended, I get (as expected):

mytext=hello

So how does the Greek letter Omega end up percent-encoded as:

%26%23937%3B

Thanks


Solution

  • This happens when your web server declares an encoding that cannot represent the character. For example, ISO-8859-1, which is the default encoding for many web servers, has no Ω.

    What you see is an HTML numeric character reference, &#937;, percent-encoded: %26 is &, %23 is #, and %3B is ; (937 is the decimal code point of Ω). Because &, #, ; and the digits are all ASCII characters, this is the only way to avoid losing information when the browser thinks the server only supports ISO-8859-1.

    To fix this, declare UTF-8 in your HTTP header:

    Content-Type: text/html; charset=utf-8
    

    This behavior isn't even consistent between browsers: IE encodes it as hello%D9. The byte 0xD9 is Ω in the Greek code page windows-1253, but it reads as Ù in CP1252/ISO-8859-1.
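
The encodings described above can be reproduced with a small Python sketch. It is only an illustration using the standard library, not what any browser literally runs: the variable names are made up for this example, and Python's `xmlcharrefreplace` error handler is used here as a stand-in for the browser's fallback of writing unrepresentable characters as numeric character references.

```python
import html
from urllib.parse import quote, unquote

observed = "hello%26%23937%3B"  # value seen in the URL from the question

# Step 1: undo the percent-encoding. What remains is an HTML numeric
# character reference, not the raw letter.
reference = unquote(observed)        # "hello&#937;"

# Step 2: resolve the reference to get the text that was typed.
text = html.unescape(reference)      # "helloΩ"

# Stand-in for the browser fallback: ISO-8859-1 has no Ω, so the character
# is written as &#937; and the ASCII result is then percent-encoded.
fallback_bytes = text.encode("iso-8859-1", errors="xmlcharrefreplace")
fallback_form = quote(fallback_bytes)            # "hello%26%23937%3B"

# With UTF-8 declared, Ω is encoded directly as two bytes.
utf8_form = quote(text, encoding="utf-8")        # "hello%CE%A9"

# In the Greek code page windows-1253, Ω is the single byte 0xD9.
greek_form = quote(text, encoding="windows-1253")  # "hello%D9"

# The same byte read as CP1252 is a different character (Ù).
misread = bytes([0xD9]).decode("cp1252")

print(reference, text, fallback_form, utf8_form, greek_form, misread)
```

Once the page is served as UTF-8, the server should expect `mytext=hello%CE%A9` and decode the query string as UTF-8.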