
URL encoding, form encoding and mailto: encoding


I am a little bit confused about the encoding issues related to HTML. I am not referring to the charset in the headers or the encoding in the XML prologue. That I get. Let me explain.

When "mailto:" is used with an anchor or a submit button in a form, white space is encoded as "%20" and a line break (line feed/carriage return/new line/end of line) is encoded as "%0A". When the enctype attribute is used on a form with a value of "application/x-www-form-urlencoded", however, white space is encoded as "+", and special characters such as apostrophes, percent signs and other symbols are converted to their hexadecimal ASCII equivalents. Is "application/x-www-form-urlencoded" a form of URL encoding? If so, why "%20" in the first case and "+" in the second?

"mailto:[email protected][email protected]&[email protected]&subject=This%20is%20the%20subject&body=This%20is%20the%20body%0AThis%20is%20the%20second%20paragraph"

In the example above, white space in the subject is encoded as %20 and the new line in the body is encoded as %0A.

<form enctype="application/x-www-form-urlencoded"></form>

And in the example above, white space will be encoded as "+". Am I missing something?

Thanks in advance.


Solution

  • URIs (like your mailto example) should be encoded according to RFC 3986, which specifies that spaces are to be encoded as %20.

    Form data, on the other hand, is encoded as application/x-www-form-urlencoded according to the rules defined by the HTML specification. (See, for example, section 17.13.3.3 of the HTML 4.01 specification.) These rules specify that spaces are to be replaced by + signs.

    Thus, while percent encoding is similar between URIs and form data, the space character is treated differently.
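
The difference described above can be seen with Python's standard urllib.parse module. This is only an illustrative sketch: the address someone@example.com and the variable names are made up for the example, and Python's functions stand in for what a browser does when it builds a mailto: URI or submits a form.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

subject = "This is the subject"
body = "This is the body\nThis is the second paragraph"

# URI-style percent-encoding: space becomes %20, line feed becomes %0A.
uri_subject = quote(subject, safe="")
uri_body = quote(body, safe="")

# Hypothetical recipient, used only to show where the encoded values go.
mailto = f"mailto:someone@example.com?subject={uri_subject}&body={uri_body}"

# Form-style encoding (application/x-www-form-urlencoded): space becomes +.
# Other reserved characters are still percent-encoded, as in a URI.
form_subject = quote_plus(subject)
form_data = urlencode({"subject": subject, "body": body})

# Decoding must match the encoding that was used: only the form-style
# decoder turns "+" back into a space.
plain_uri_decode = unquote("a+b")
form_decode = unquote_plus("a+b")

print(mailto)
print(form_data)
```

Apart from the space character, both outputs use the same %XX escapes, which is why the two formats are easy to confuse. A literal "+" in form data has to be sent as %2B so that it is not read back as a space.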