
Why are double-quotes urlencoded as %22?


As far as I know, URL encoding exists because URLs can only contain ASCII characters. But since " is already in the ASCII table, why does it need to be encoded as %22?


Solution

  • RFC 1738 (Uniform Resource Locators), section 2.2 (URL Character Encoding Issues), lists the " character as "Unsafe". The reason it gives is:

    The quote mark (""") is used to delimit URLs in some systems.

    One case I can think of is an HTML attribute. For example, in an <a> tag with an href attribute, the URL is usually enclosed in double quotes. If a " character inside the URL is not encoded, it ends the attribute value early and the tag becomes malformed:

    <a href="https://example.com/this"should-be-quoted">...</a>
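
As a rough illustration of the fix, here is a minimal Python sketch that percent-encodes the path segment before placing it in the attribute. The names `_HrefCollector` and `hrefs_in` are hypothetical helpers introduced only for this example, and the URL is the one from the tag above; `urllib.parse.quote` and `html.parser.HTMLParser` come from the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


def hrefs_in(markup):
    # Hypothetical helper: returns the href values a parser recovers from markup.
    collector = _HrefCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs


segment = 'this"should-be-quoted'
encoded = quote(segment, safe="")  # the double quote becomes %22
fixed_tag = f'<a href="https://example.com/{encoded}">...</a>'

print(fixed_tag)            # <a href="https://example.com/this%22should-be-quoted">...</a>
print(hrefs_in(fixed_tag))  # ['https://example.com/this%22should-be-quoted']
```

With the quote encoded as %22, the attribute value runs to the intended closing delimiter and the parser recovers the full URL.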
    

    The RFC goes on to say:

    All unsafe characters must always be encoded within a URL.


    Some examples of other unsafe characters:

    The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text.

    The character "%" is unsafe because it is used for encodings of other characters.

    The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it.
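
To make the encodings concrete, the following sketch percent-encodes each of the unsafe characters quoted above using Python's standard `urllib.parse.quote`. The variable names are illustrative only, and this shows how one common library encodes these characters rather than anything the RFC itself prescribes.

```python
from urllib.parse import quote, unquote

# Characters quoted from the RFC's "unsafe" list in the answer above.
unsafe = ['"', "<", ">", "%", "#"]

# safe="" means even "/" would be escaped; letters, digits and _.-~ are never escaped.
encoded = {ch: quote(ch, safe="") for ch in unsafe}

for ch, enc in encoded.items():
    print(repr(ch), "->", enc)
# '"' -> %22
# '<' -> %3C
# '>' -> %3E
# '%' -> %25
# '#' -> %23

# Decoding restores the original characters.
assert all(unquote(enc) == ch for ch, enc in encoded.items())
```

Each escape is simply "%" followed by the character's ASCII code in hexadecimal, which is why " (0x22) becomes %22.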