
SID on frontend - pros/cons - disallow?


What are the pros and cons of having the SID enabled on the frontend? What would you recommend?

Either way, we believe these URLs should not be indexed by Google. How do we make sure that URLs like http://www.uretilalt.dk/brands/copha-ure?SID=a44apq55dg17192fj345bnb6m6

and also http://www.uretilalt.dk/kategorier/ure-med-laenke?limit=36

are not indexed by Google?


Solution

  • With robots.txt you can disallow crawling of URLs with parameters.

    With meta-robots you can disallow indexing of URLs with parameters.

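    With rel-canonical you can declare which URL should be the canonical one.

    Note that these are not interchangeable: a URL that is only blocked in robots.txt can still end up in the index if other pages link to it, and a crawler can only see a meta-robots or rel-canonical element on a page it is allowed to crawl.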

    If possible, you shouldn’t include the SID in the URL at all. Users might share the URL, might bookmark it, might submit it to some service, etc., which means you end up with many different URLs pointing to the same resource (which is something you should avoid). Furthermore, URLs with SIDs are ugly.

    If you have parameters that manipulate the output, as is probably the case for your ?limit=36, the best solution is to use rel-canonical:

    <!-- on <http://www.uretilalt.dk/kategorier/ure-med-laenke?limit=36> -->
    <link rel="canonical" href="http://www.uretilalt.dk/kategorier/ure-med-laenke" />
    

    However, you should only use this if the content on /kategorier/ure-med-laenke?limit=36 is identical to or a subset of the content on /kategorier/ure-med-laenke. For example, it is probably not appropriate if these pages are paginated, because each page then shows different items.
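
    As a rough illustration of how the canonical URL could be derived on the server side, here is a minimal Python sketch. It is not taken from the shop software in question: the function names and the IGNORED_PARAMS set are hypothetical, and it assumes that SID and limit never change which resource is shown. Any other parameter (for example a page number) is kept, in line with the caveat above.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters do not change which resource is shown.
# Adjust the set to match what your own pages actually do.
IGNORED_PARAMS = {"sid", "limit"}


def canonical_url(url: str) -> str:
    """Return the URL with the ignored query parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # An empty query string is dropped entirely, so no trailing "?" remains.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the given request URL."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.uretilalt.dk/kategorier/ure-med-laenke?limit=36"
    ))
```

    Keeping the ignored parameters in an explicit set makes it easy to review which parameters are treated as irrelevant, which matters because a parameter that really changes the content must not be stripped.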