One of the methods I use to filter bots from real users is checking the useragent. After detection I block the IP they are using. I'm seeing quite a lot of visitors coming in with 'Google Web Preview' embedded in their useragent (example):
mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 (khtml, like gecko; google web preview) chrome/41.0.2272.118 safari/537.36
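A check like this could be sketched as follows in Python. The function name is hypothetical and this is not the actual filter in use; it only illustrates matching the marker case-insensitively, and it collapses whitespace in case the logged string was wrapped.

```python
def has_google_web_preview_marker(user_agent: str) -> bool:
    """Return True if the useragent string contains 'Google Web Preview'."""
    # Lowercase and collapse any whitespace (including line breaks) first,
    # so a wrapped or mixed-case log entry still matches.
    normalized = " ".join(user_agent.lower().split())
    return "google web preview" in normalized


if __name__ == "__main__":
    ua = ("mozilla/5.0 (x11; linux x86_64) applewebkit/537.36 "
          "(khtml, like gecko; google web preview) "
          "chrome/41.0.2272.118 safari/537.36")
    print(has_google_web_preview_marker(ua))
```

A match only says what the client claims to be; the useragent is sent by the client and can be set to anything.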
When I check the IP addresses related to this useragent, they don't seem to be related to Google. They are all just household IP addresses coming from all over the world. When I follow such a user on my website, I notice that the useragent changes to the following as soon as they continue to browse my site:
mozilla/5.0 (ipad; cpu os 10_3_3 like mac os x) applewebkit/602.1.50 (khtml, like gecko) gsa/33.0.164895372 mobile/14g60 safari/602.1
Question: Is this a bot, or is Google using a visitor to generate a preview? The IP is behaving just like a regular user would (e.g. clicking on links, blocked by robots.txt).
Thanks!
When you open a new tab in Chrome, there are some most-visited links below the Google search input.
Those preview images are generated by loading the web page, and when the page is loaded for that purpose the user agent contains "Google Web Preview".
So I would say it is a bot.
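If you want to check whether a request with a Google-looking user agent really comes from Google's own network, one common approach is a reverse DNS lookup on the IP followed by a forward lookup on the resulting hostname. Below is a minimal Python sketch of that idea. The function names are hypothetical, and the hostname suffix list is an assumption based on Google's published guidance for verifying its crawlers, so check the current documentation before relying on it. The forward lookup used here only covers IPv4.

```python
import socket

# Assumed hostname suffixes for Google-operated crawlers (verify before use).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_google_hostname(hostname: str) -> bool:
    """Return True if the hostname ends in one of the assumed Google suffixes."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def ip_resolves_to_google(ip: str,
                          reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex) -> bool:
    """Reverse-resolve the IP, check the domain, then confirm with a forward lookup.

    The resolver functions are parameters so the logic can be exercised
    without network access.
    """
    try:
        hostname = reverse(ip)[0]          # (hostname, aliases, addresses)
        if not is_google_hostname(hostname):
            return False
        # The forward lookup must map the hostname back to the same IP.
        return ip in forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror: no usable DNS record.
        return False
```

An ordinary household IP would normally fail this check, either because it has no reverse record or because its hostname belongs to an ISP rather than to Google.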