Does the HTML <link rel="canonical"> refer to URL or content or both?
I have a part of a website done in pure HTML + CSS + JavaScript with no server side. When a user enters the site with the root URL, /index.htm is loaded. The root index.htm redirects to /site1/index.htm.
I would like to indicate that the canonical URL for /site1/index.htm should be /index.htm, and the canonical URL for that, in turn, should be /, so that the redirect can point elsewhere later if needed. In this sense, specifying a canonical URL is intended to indicate that users arriving at /site1/index.htm should always enter the site through the specified path if possible.
I'm wondering if specifying <link rel="canonical" href="/index.htm"> in /site1/index.htm, and <link rel="canonical" href="/"> in /index.htm, would accomplish this. (I'm aware that absolute URLs are recommended, but this may not always be possible.)
The web server could be IIS, Apache, or other. I can't touch the server config or headers or htaccess.
Can this be done in HTML or possibly JavaScript? (I'm aware that JavaScript won't affect SEO, but it may have something to do with the redirect. Currently, the redirect is done using both meta refresh and JavaScript location = '', with a fallback link for the user to click. As mentioned, I can't touch headers or server config.)
Further, if <link rel="canonical"> is used in said fashion, would search engines index the content of the target in place of the specifying page? For example, would search engines assume the content of /site1/index.htm is the same as /index.htm, so that the URL /site1/index.htm would get associated with the actual contents of /index.htm?
I'm new here, so I don't know whether this is off topic, but I'll try to answer the question.
The <link rel="canonical"> tag is fairly straightforward. It works like this.
When a search engine spider crawls your page, the tag tells it which URL should be indexed for that particular page. It's very useful when the same content can be reached through several different URLs (one example among others: the non-www and www versions of a URL).
Example: You have multiple product pages for a specific category on your website because you use pagination. In this case you will have several URLs for your paginated content: page 1, page 2, page 3, etc. Adding a <link rel="canonical"> tag pointing to the first page on all of these pages tells the search engine that it should index only the first page instead of indexing every paginated page.
Basically, you're telling the spider: don't index this URL, index that other URL instead. Search engines treat this as a strong hint rather than a strict command.
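As a rough illustration of the pagination example above, here is a minimal sketch of what the second page of a paginated category might contain. The domain www.example.com, the /shoes path, and the ?page=2 parameter are all made up for this example and do not come from the question.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Shoes, page 2</title>
  <!-- Hypothetical: this file is served at https://www.example.com/shoes?page=2 -->
  <!-- The tag asks search engines to treat the first page as the preferred URL -->
  <link rel="canonical" href="https://www.example.com/shoes">
</head>
<body>
  <p>Products 21 to 40 would be listed here.</p>
</body>
</html>
```

The same tag, with the same href, would go in the head of page 3, page 4, and so on.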
In your particular case, /index.htm does nothing except redirect to /site1/index.htm (through meta refresh and JavaScript, as you describe, rather than a server-side 301). The risk is that Google won't index your page: you are telling it not to index the content at /site1/index.htm and to index /index.htm instead, but that page has no content of its own because it only redirects.
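To make the problem concrete, here is a sketch of what the root /index.htm would look like with the setup described in the question. The paths and the canonical href come from the question; the 0-second refresh delay, the page title, and the link text are assumptions made for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Redirecting</title>
  <!-- Proposed in the question: the root URL is the canonical for this file -->
  <link rel="canonical" href="/">
  <!-- Meta refresh redirect; the 0-second delay is an assumption -->
  <meta http-equiv="refresh" content="0; url=/site1/index.htm">
  <script>
    // JavaScript redirect, as described in the question
    window.location = "/site1/index.htm";
  </script>
</head>
<body>
  <!-- Fallback link for users whose browser follows neither redirect -->
  <p><a href="/site1/index.htm">Continue to the site</a></p>
</body>
</html>
```

Apart from the fallback link, there is nothing on this page for a search engine to index, while /site1/index.htm, which holds the real content, would be pointing its own canonical tag back here.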
I'm aware that you said you have no access to the .htaccess file, but the only way I can think of that doesn't touch your folder structure on your FTP is to use .htaccess to rewrite /site1/index.htm to /index.htm, and then add the canonical tag just to be safe, because having a canonical tag is good practice.