Tags: php, http-status-code-404, search-engine-bots

Tell search engines that a page does not exist


I have checked the logs and found that search engines visit a lot of bogus URLs on my website. They most likely date from before many of the links were changed. Even though I have set up 301 redirects, some links have been altered in very strange ways and aren't recognized by my .htaccess file.

All requests are handled by index.php. If a response can't be created because of a bad URL, a custom error page is presented instead. Simplified, index.php looks like this:

try {
  $Request = new Request();
  $Request->respond();
} catch(NoresponseException $e) {
  $Request->presentErrorPage();
}

I just realized that this error page returns a 200 status, telling the bot that the page is valid even though it isn't.

Is it enough to add a header with 404 in the catch statement to tell the bots to stop visiting that page?

Like this:

header("HTTP/1.0 404 Not Found");

It looks OK when I test it, but I'm worried that search engine bots (and maybe other user agents) will get confused.


Solution

  • You're getting there. The idea is correct: you want to give them a 404. Just one small correction, though: if the client sends its request using HTTP/1.1 and you answer using HTTP/1.0, some clients may get confused.

    The way around this is as follows:

    header($_SERVER['SERVER_PROTOCOL']." 404 Not Found");
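
To show how this could fit into the catch block from the question, here is a minimal sketch. The helper names notFoundStatusLine and sendNotFound are hypothetical (they are not part of the original code), and the fallback to HTTP/1.1 when SERVER_PROTOCOL is missing is an assumption made so the helper also behaves sensibly outside a web server.

```php
<?php
// Hypothetical helper: build a status line that echoes the protocol
// version the client used. Falls back to HTTP/1.1 (an assumption) when
// SERVER_PROTOCOL is not set, for example when run from the command line.
function notFoundStatusLine(array $server): string
{
    $protocol = $server['SERVER_PROTOCOL'] ?? 'HTTP/1.1';
    return $protocol . ' 404 Not Found';
}

// Hypothetical helper: send the 404 status line, but only if no output
// has been written yet, since headers cannot be changed after that point.
function sendNotFound(array $server): void
{
    if (!headers_sent()) {
        // The third argument sets the response code explicitly as well.
        header(notFoundStatusLine($server), true, 404);
    }
}

// Intended use inside the catch block from the question:
//
//   } catch (NoresponseException $e) {
//       sendNotFound($_SERVER);
//       $Request->presentErrorPage();
//   }
```

The status has to be sent before the error page writes any output. On PHP 5.4 and later, the built-in http_response_code(404) is a shorter alternative to building the status line by hand.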