Is there a preferred canonical form for the path part of URLs?


All of these URLs are equivalent:

The "rel='canonical'" link allows me to specify whichever I want.

Is one of those forms considered "better" or "more standard" than the others?

As a maintainer, I personally prefer the first one, as it allows me the freedom to change "Me" to be "Me.php", or change "index.html" to be "index.shtml", or some other form should I ever need to, without having to define redirects, or to change any existing links to this URL. (This isn't specific to "index"; it could be for any web page.)

I.e. using that simplest form avoids publishing what is only an implementation detail that is best hidden from the users.

Unfortunately, of all the forms, my preferred choice is the only one that web servers don't like; they return "HTTP/1.1 301 Moved Permanently" and add the trailing "/".

  • For directories, is incurring this redirection penalty worth it?
  • For non-directories, is there any reason I shouldn't continue omitting the suffix?
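To make the redirect concrete, here is a minimal sketch of the behaviour described above. It is not any particular server's implementation; the function name `respond`, the `directories` set, and the path "/docs" are all assumptions made up for illustration.

```python
def respond(path, directories):
    """Return (status, location) the way many servers treat directory paths.

    `directories` is a hypothetical set of URL paths that map to directories.
    """
    if not path.endswith("/") and path in directories:
        # Directory requested without the trailing slash: redirect to add it.
        return 301, path + "/"
    # Anything else is served as requested in this simplified model.
    return 200, path


dirs = {"/docs"}                      # assumed example directory
print(respond("/docs", dirs))         # slash-less directory form
print(respond("/docs/", dirs))        # form with the trailing slash
print(respond("/docs/index", dirs))   # a page inside the directory
```

In this model only the slash-less directory form costs an extra round trip, which is why it makes a poor canonical URL.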

Added after receiving the answer:

  • It's nice to know I'm not the only one that thinks omitting suffixes is a good idea.
  • And I just realized that my problem with directories goes away if I use "directoryname/index" as the canonical form.
  • Thanks.

Solution

  • For directories, is incurring this redirection penalty worth it?

    No.

    "The canonical URL for this resource is a 301 redirect to another URL" doesn't make sense.

  • For non-directories, is there any reason I shouldn't continue omitting the suffix?

    No.

    There is a reason to omit the suffix: including it leaks information about the technologies used to build the site, and makes it harder to change them (i.e. if you moved away from static HTML files to a PHP based system, then you'd need to redirect all your old URLs … or configure your server to process files with a .html extension as PHP, which is possible, but confusing).
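    As a rough illustration of why the suffix-less form survives a technology change, here is a minimal sketch of a lookup that maps an extensionless URL path to whatever file currently backs it. The names `SUFFIXES` and `resolve`, the lookup order, and the example paths are assumptions for illustration, not how any specific server does it.

```python
# Hypothetical lookup order; a real site would list whatever it actually serves.
SUFFIXES = [".html", ".shtml", ".php"]


def resolve(url_path, files):
    """Map an extensionless URL path to a stored file name, or None.

    `files` is a hypothetical set of file paths that exist on the server.
    """
    for suffix in SUFFIXES:
        candidate = url_path + suffix
        if candidate in files:
            return candidate
    return None


# The public URL "/about" stays the same before and after a move to PHP.
print(resolve("/about", {"/about.html"}))
print(resolve("/about", {"/about.php"}))
```

    Because the public URL never mentions the suffix, swapping the backing file needs no redirects and no changes to existing links.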