What are the safe characters for making URLs?


I am making a website with articles, and I need the articles to have "friendly" URLs, based on the title.

For example, if the title of my article is "Article Test", I would like the URL to be http://www.example.com/articles/article_test.

However, article titles (like any string) can contain special characters that cannot appear literally in a URL. For instance, I know that ? and # need to be replaced, but I don't know all the others.

What characters are permissible in URLs? What is safe to keep?


Solution

  • To quote section 2.3 of RFC 3986:

    Characters that are allowed in a URI, but do not have a reserved purpose, are called unreserved. These include uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    

    Note that RFC 3986 lists fewer unreserved punctuation marks than the older RFC 2396, which also treated "!", "*", "'", "(", and ")" as unreserved.
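
As a rough illustration of how the unreserved set could be applied to the article titles in the question, here is a minimal Python sketch. The function name `slugify`, the choice to lowercase the title, and the use of an underscore as the separator are assumptions made for this example (the underscore simply matches the `article_test` URL in the question); none of them is required by RFC 3986.

```python
import re
import unicodedata

# Anything that is NOT an RFC 3986 unreserved character (after lowercasing).
NOT_UNRESERVED = re.compile(r"[^a-z0-9._~-]+")


def slugify(title: str) -> str:
    """Turn a title into a URL path segment made only of unreserved characters.

    Hypothetical helper for illustration; the lowercasing and the "_"
    separator are design choices, not part of the RFC.
    """
    # Split accented letters into base letter + combining mark, then drop
    # everything that is not ASCII (the marks, and any other non-ASCII text).
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of disallowed characters (spaces, "?", "#", ...)
    # into a single underscore.
    slug = NOT_UNRESERVED.sub("_", ascii_title.lower())
    # Avoid leading or trailing separators.
    return slug.strip("_")


print(slugify("Article Test"))  # article_test
```

If the original title has to survive unchanged, percent-encoding is the alternative: in Python, `urllib.parse.quote(title, safe="")` escapes the characters outside the unreserved set instead of replacing them.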