google-custom-search

How long does Google continue polling a linked CSE specification file after it's requested?


When you create a Google Custom Search Engine (CSE) with a linked specification file on your server, Google's "FeedFetcher-Google-CoOp" bot requests that file in order to build the CSE. It appears that even after results have been returned to the user and the specification file is no longer used, Google continues polling it regularly for at least several days.

My question is how long Google will continue polling the file after your CSE code has stopped requesting it, and whether there is any way to force it to stop immediately.

(We created a dynamic linked CSE that was unique to each query, which meant that a great many specification files were requested: the same script with different GET arguments each time. Now that we are no longer using them, FeedFetcher-Google-CoOp continues to request this script with various past arguments.)

FeedFetcher-Google-CoOp ignores robots.txt. We are now returning 410 (Gone) for all requests, but it is difficult to tell whether this is having any effect, since so many different versions are being requested (e.g. /script.php?query=). Ideally there would be some way to tell Google that script.php does not exist, regardless of arguments, but without robots.txt I can't find a way to do so.

TL;DR: 1) Will Google stop requesting this script on its own eventually? If so, when? 2) Is there a way to stop it requesting immediately?


Solution

  • If left alone, Google appears to keep requesting these files indefinitely (for months at least). It ignores 410 (Gone) responses, but it does appear to respect 301 redirects. So to stop Google from requesting outdated CSE specifications, you can 301-redirect them to a null file. Google will likely still request the file once more for each set of arguments it has cached, but should stop after that.
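
For illustration, here is a minimal sketch of the redirect idea in Python, using only the standard library. It is not the setup from the question (which used a PHP script); the path `/script.php` is taken from the question, while `/empty.xml`, the port, and the function name `redirect_target` are assumptions made up for this example. The point it shows is matching on the path alone, so every cached set of GET arguments gets the same 301.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

RETIRED_PATH = "/script.php"  # the retired specification script from the question
NULL_TARGET = "/empty.xml"    # hypothetical: a static, empty placeholder file


def redirect_target(request_path):
    """Return the redirect target for a retired spec URL, or None otherwise."""
    # Compare only the path component, so any query string matches.
    if urlsplit(request_path).path == RETIRED_PATH:
        return NULL_TARGET
    return None


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            # Anything else is outside the scope of this sketch.
            self.send_error(404)
            return
        # Permanent redirect, regardless of the GET arguments.
        self.send_response(301)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()


if __name__ == "__main__":
    # Assumed address and port, for local testing only.
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

On a real site the same rule would more likely live in the web server configuration or at the top of the script itself; the sketch only demonstrates the matching logic.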