As far as I know, URL encoding exists because URLs only support ASCII characters. But since " is already in the ASCII table, why should it be encoded as %22 in URL encoding?
The " character is listed as "Unsafe" in section 2.2 (URL Character Encoding Issues) of RFC 1738 (Uniform Resource Locators). The reason given for its inclusion is:
The quote mark (""") is used to delimit URLs in some systems.
One case of this that I can think of is an HTML attribute. For example, if you have an <a> tag with an href attribute, you will likely enclose the URL in double quotes. If a " character inside the URL is not encoded, it ends the attribute value early and the tag becomes invalid:
<a href="https://example.com/this"should-be-quoted">...</a>
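As a minimal sketch of the fix, assuming Python's standard urllib.parse module (the names raw_path, encoded_path and anchor are made up for illustration), percent-encoding the path before placing it in the attribute keeps the tag intact:

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a literal double quote.
raw_path = '/this"should-be-quoted'

# quote() percent-encodes the double quote; "/" is left alone by default.
encoded_path = quote(raw_path)
url = "https://example.com" + encoded_path

# The only double quotes left in the tag are the two attribute delimiters.
anchor = '<a href="{}">...</a>'.format(url)
print(anchor)

# Decoding recovers the original path.
print(unquote(encoded_path))
```

The receiving side decodes %22 back to ", so no information is lost by encoding it.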
The RFC goes on to say:
All unsafe characters must always be encoded within a URL.
Some examples of other unsafe characters:
The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text.
The character "%" is unsafe because it is used for encodings of other characters.
The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it.
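To make the encoding concrete, here is a small sketch, again assuming Python's urllib.parse (the variable names are illustrative only). It shows that each percent code is simply the character's ASCII value written as two hexadecimal digits:

```python
from urllib.parse import quote

# The unsafe characters discussed above.
unsafe_chars = ['"', '<', '>', '%', '#']

# safe="" tells quote() not to exempt any reserved character from encoding.
encoded = {ch: quote(ch, safe="") for ch in unsafe_chars}

for ch, enc in encoded.items():
    # Build the same code by hand from the ASCII value, for comparison.
    by_hand = "%{:02X}".format(ord(ch))
    print(repr(ch), "->", enc, "(by hand: {})".format(by_hand))
```

So %22 is not arbitrary: 0x22 is the ASCII code for the double quote, just as 0x25 is the code for % itself.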