webserver, url-encoding

Is it almost a standard that web applications assume the query string is in UTF-8?


When we type or paste this into a browser's address bar:

http://www.google.com/search?q=%E5%A4%A9

I think there is no way to tell from the URL itself whether the percent-encoded bytes are UTF-8 or some other encoding, so the application will usually assume UTF-8. So is it entirely up to the app to interpret the bytes in whatever encoding it wants or assumes?

(On all the websites I have seen, and even the platform I worked on, it seems to be almost always UTF-8.)
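
To make the ambiguity concrete, here is a minimal Python sketch (Python and the variable names are illustrative choices, not part of the original question). It percent-decodes the value from the URL above into raw bytes, then interprets those same bytes under two different assumed encodings:

```python
from urllib.parse import unquote_to_bytes

# The query value from the URL above, still percent-encoded.
raw = "%E5%A4%A9"

# Percent-decoding only yields bytes; nothing in the URL names a charset.
octets = unquote_to_bytes(raw)

# The receiving app has to pick an encoding to turn the bytes into text.
as_utf8 = octets.decode("utf-8")      # a single character, U+5929
as_latin1 = octets.decode("latin-1")  # three unrelated characters

print(octets, as_utf8, as_latin1)
```

Both decodes succeed without error, which is why the app cannot detect the "right" encoding from the bytes alone and has to assume one.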

Update: changed the question to refer to the web app instead.


Solution

  • RFC 3986 (section 2.5) says:

    "When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded."

    So UTF-8 is definitely the way to go for any new HTTP GET APIs.
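
The two steps the RFC describes (encode the text as UTF-8 octets, then percent-encode the octets outside the unreserved set) can be sketched in Python roughly as follows. The helper name `encode_query_value` is hypothetical, and passing `safe=""` is an assumption made here so that only the unreserved characters are left as-is:

```python
from urllib.parse import quote, unquote

def encode_query_value(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then percent-encode every
    octet that is not an unreserved character (letters, digits, -._~)."""
    return quote(text, safe="", encoding="utf-8")

# Encoding the character from the question reproduces the URL's value.
encoded = encode_query_value("天")
url = "http://www.google.com/search?q=" + encoded

# A receiver that also assumes UTF-8 recovers the original text.
decoded = unquote(encoded, encoding="utf-8")

print(url, decoded)
```

As long as both sides agree on UTF-8, the round trip is lossless; the agreement is a convention between client and app, not something carried in the URL.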