We want to set up a small honeypot image in our HTML bodies to detect scrapers and bad bots.
Has anyone set something like this up before?
We were thinking the best way to go about it would be to:
a) Comment the HTML out via:
<!-- <img src="http://www.domain.com/honeypot.gif"/> -->
b) Apply CSS styles to the image that would hide it from browsers via:
.... id="honeypot" ....
#honeypot {
    display: none;
    visibility: hidden;
}
Using the above, does anyone foresee any situation where a legitimate user agent would fetch the image or attempt to render it?
The honeypot.gif URL would be rewritten by mod_rewrite to a PHP script, where we would do our logging.
While I understand that a well-coded scraper might skip both of the traps above, they would at least shed some light on the very dirty ones.
Any other pointers as to the best way to go about this?
A bot will ignore your img tag because it's within a comment.
Instead, you might consider creating an invisible div which contains a link to a trigger URL on the same site (preferably within the same directory, in case the bot is depth-sensitive).
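A minimal sketch of what that could look like, reusing the "honeypot" id and the display:none rule from the question. The trigger URL "trigger.php" and the link text are hypothetical placeholders, not anything from the original setup; the relative href simply keeps the trigger in the same directory as the page.

```html
<!-- Sketch only: "trigger.php" is a hypothetical trigger URL in the same directory -->
<style>
  /* Hides the whole div, so regular visitors never see the link */
  #honeypot { display: none; }
</style>

<div id="honeypot">
  <!-- A bot that follows every href it finds would request trigger.php -->
  <a href="trigger.php">Do not follow this link</a>
</div>
```

Requests for the trigger URL can then be logged the same way as planned for honeypot.gif.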