php ip detection bots

IP blocker class blocks "crawl-66-249-76-64.googlebot.com", is that correct?


I wrote a very strict protection class, "BlockIp", that can use a blacklist of IPs, detect strange IP configurations, and block proxies.

When it blocks a visitor, I get a detailed email about the visitor, why it was blocked, and what they were trying to do (once a day, of course). It seems to work very well: it has blocked some real attacks in the past. It does not block legitimate bots, but it is not easy to test whether the detection method is correct.

Today I received an email from the class saying that it had blocked "crawl-66-249-76-64.googlebot.com", which identifies itself as a Google robot. I searched the net to check whether the IP was blacklisted but did not find it on any blacklist. Googling for "66.249.76.64" shows the IP listed on many sites.

I received two error e-mails from the class. The first one is when the "bot" tries to access "robots.txt", and the second one when it tries to access the root of the site.

My question is: is this a Google bot or not? (If it is, there is something wrong with the detection, and I have to fix that.) I did not find the IP in the list of Google IP ranges at http://chceme.info/ips/

Here is some information about the bot:

Ticket ID : {EVNT_117162_2013011220130110_32925_19904}
Event type : Access blocked
Event date : 01/12/2013 - 03:53:01 (server date-time)
Event counter : First occurring
Processed url : mysite/robots.txt
From url : Unknown or direct link
Domain : mysite
Domain IP : 000.000.000.000
Visitor IP : 66.249.76.64
Proxy IP : (not present)



Problem : Potential danger detected - 66.249.76.64
Hostname : crawl-66-249-76-64.googlebot.com
Block : Yes
Refferer : (direct access)
AgentString : Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Browser : Mozilla 5.0
Platform : Unknown Platform
Robot : Yes - Googlebot
Mobile : No
Tablet : No
Console : No
Crawler : Yes
Agent_type : crawler
Agent_name : googlebot
Agent_version : 2.1
Os_type : unknown
Os_name : unknown
Agent_languagetag : en
Status : ok
Request : 66.249.76.64
Languagecode : us
Country : United States
Region : California
City : Mountain View
Zipcode : 94043
Latitude : 37.3861
Longitude : -122.084
Timezone : -08:00
Areacode : 650
Dmacode : 807
Continentcode : na
Regioncode : ca
Currencycode : USD
Currencysymbol : $
Currencysymbol_utf8 : $
Currencyconverter : 1
Extended : 1

Solution

  • First of all, yes, this is Google. You can verify Googlebot as described here: https://support.google.com/webmasters/bin/answer.py?hl=en&answer=80553
    By the way, regarding the first email, triggered when the "bot" requested "robots.txt": a bot should always be allowed to visit /robots.txt.
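
The verification that Google describes is a reverse DNS lookup on the visitor IP, a check that the resulting host name belongs to googlebot.com or google.com, and a forward DNS lookup confirming that the host name resolves back to the same IP. A minimal sketch of that check in PHP could look like the following. The function name `isVerifiedGooglebot` and the two optional resolver parameters are assumptions made for this illustration (they are not part of the "BlockIp" class from the question); the resolvers default to PHP's built-in `gethostbyaddr()` and `gethostbynamel()`, which handle IPv4 only.

```php
<?php
/**
 * Hypothetical helper: returns true when $ip passes a reverse-then-forward
 * DNS check for Googlebot. $reverse and $forward exist only so the check
 * can be exercised without network access; by default they are PHP's
 * built-in resolvers.
 */
function isVerifiedGooglebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse DNS. gethostbyaddr() returns the unmodified IP when
    // the lookup fails, and false on malformed input.
    $host = $reverse($ip);
    if ($host === false || $host === $ip) {
        return false;
    }

    // Step 2: the host name must end in .googlebot.com or .google.com.
    // Anchoring at the end rejects names like "googlebot.com.evil.example".
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }

    // Step 3: forward DNS must map the host name back to the same IP,
    // otherwise anyone controlling a PTR record could spoof the name.
    $addresses = $forward($host);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In a blocker like the one in the question, a call such as `isVerifiedGooglebot($visitorIp)` could run before the block decision whenever the user agent claims to be Googlebot, and the result could be cached per IP so that the two DNS lookups are not repeated on every request.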