c# ascii urlencode utf-16

URL encode ASCII/UTF16 characters


I'm trying to URL-encode some strings, but I'm having problems with the methods provided by the .NET Framework.

For instance, I'm trying to encode strings that contain the 'â' character. According to w3schools, I would expect this character to be encoded as '%E2' (and a PHP system I must call expects this too...).

I tried using these methods:

System.Web.HttpUtility.UrlEncode("â");
System.Web.HttpUtility.UrlPathEncode("â");
Uri.EscapeUriString("â");
Uri.EscapeDataString("â");

However, they all encode this character as: %C3%A2

I suppose this has something to do with the fact that strings in .NET are UTF-16 encoded. To work around the problem, I could write something like this:

"%" + ((int)character).ToString("X")

However, I would like to know whether the framework already has a built-in method for this. I can't find any answer, here or elsewhere, as to why my character is encoded this way.


Solution

  • The reason is not that .NET uses UTF-16 encoded strings. The reason is that the UrlEncode(string) overload uses UTF-8 by default, and %C3%A2 is the correct UTF-8 encoding of â:

    The HttpUtility.UrlEncode method uses UTF-8 encoding by default. Therefore, using the UrlEncode method provides the same results as using the UrlEncode method and specifying UTF8 as the second parameter.

    If you prefer a different encoding (for example Latin-1 or code page 1252, where â is the single byte 0xE2 and is therefore encoded as %E2), you can use another overload that allows you to specify an encoding:

    var x = HttpUtility.UrlEncode("â", Encoding.GetEncoding(1252));
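
    To make the difference concrete, here is a minimal sketch that prints the raw bytes of â in both encodings and then percent-encodes them. The class name UrlEncodingDemo and the helper PercentEncode are made up for illustration. It uses ISO-8859-1 rather than code page 1252 only because that encoding is available without extra setup; for â both give the same byte.

```csharp
using System;
using System.Text;
using System.Web;

public static class UrlEncodingDemo
{
    // Hypothetical helper: converts the text to bytes with the given
    // encoding and percent-encodes the result via HttpUtility.
    public static string PercentEncode(string text, Encoding encoding)
    {
        return HttpUtility.UrlEncode(text, encoding);
    }

    public static void Main()
    {
        // "\u00E2" is 'â'; the escape avoids depending on the source file's encoding.
        string text = "\u00E2";

        // ISO-8859-1 (Latin-1) is one of the encodings built into the runtime.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // The same character becomes two bytes in UTF-8 and one byte in Latin-1.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(text))); // C3-A2
        Console.WriteLine(BitConverter.ToString(latin1.GetBytes(text)));        // E2

        // Percent-encoding simply writes each of those bytes as %XX.
        // HttpUtility may emit the hex digits in lowercase; the case is not significant.
        Console.WriteLine(PercentEncode(text, Encoding.UTF8));
        Console.WriteLine(PercentEncode(text, latin1));
    }
}
```

    Note that on .NET Core and .NET 5+, Encoding.GetEncoding(1252) throws unless you first call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance). Also keep in mind that the manual "%" + ((int)character).ToString("X") approach from the question only matches Latin-1 for characters in the range U+0010 to U+00FF, because it does not zero-pad and does not handle anything outside a single byte.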