Search code examples
seoweb-crawlerrobots.txtnofollow

Possible to prevent search engine spiders from infinitely crawling paging links on search results?


Our SEO team would like to open up our main dynamic search results page to spiders and remove the 'nofollow' from the meta tags. It is currently accessible to spiders via allowing the path in robots.txt, but with a 'nofollow' clause in the meta tag which prevents spiders from going beyond the first page.

<meta name="robots" content="index,nofollow">

I am concerned that if we remove the 'nofollow', the impact to our search system will be catastrophic, as spiders will start crawling through all pages in the result set. I would appreciate advice as to:

1) Is there a way to remove the 'nofollow' from the meta tag, but prevent spiders from following only certain links on the page? I have read mixed opinions on rel="nofollow", is this a viable option?

<a rel="nofollow" href="http://www.mysite.com/paginglink" >Next Page</a>

2) Is there a way to control the 'depth' of how far spiders will go? It wouldn't be so bad if they hit a few pages, then stopped.

3) Our search results pages have the standard next/previous links, which would in theory cause spiders to hit pages recursively to infinity, what is the effect of this on SEO?

I understand that different spiders behave differently, but am mainly concerned with the big players, such as Google, Yahoo, MSN.

Note our Search results pages and paging links are not bot-friendly, in that they are not re-written and have a ?name=value query string, but from what I've seen spiders no longer just abort when they see the '?' as the results pages ARE getting indexed with decent page rank.


Solution

  • To be honest you are looking at nofollow wrong. Chances are the search spiders are already especially Google, Yahoo, and MSN searching the nofollow pages, because they still have to hit those pages to see if they have a noindex.

    The real problem is nofollow doesn't actually mean don't follow, it just means don't pass on my reputation to this link. So unless you are aggressively blocking bots, which it doesn't sound like you are, changing the ROBOTS meta tag and robot commands on links will not effect performance because they are already hitting your site. To confirm this just look at your HTTP Server Log.

    So my vote is that you will not see any problem with removing the robot limits.