Tags: java, http, webrequest, httprequest, html, spam-prevention

How to know whether a URL request to my website comes from a browser or from an automated program


My requirement is to know whether a request to my web page is a genuine request (made through a browser) or an automated request generated by some Java program. Where can I differentiate between the two request types?

I need to block all requests that are generated by programs, so I am looking for a way to tell them apart.


Solution

  • There is no foolproof way of doing this. The most effective solution for me was:

    1. Implement a User-Agent check at the web server level. (Yes, this is not foolproof, because a client can send any User-Agent string it likes.) Aim to block the known, common programs that people use to hit URLs, such as libwww-perl, httpclient, etc. You should be able to build such a list from your access logs.

    2. Depending on your situation, you may or may not want search engine spiders to crawl your site. Add a robots.txt file to your server accordingly. Not all spiders and crawlers follow the instructions in robots.txt, but most do.

    3. Use a specialized tool to detect abnormal access to your site, such as https://www.cloudflare.com/, which can track all access to your site and match it against an ever-growing database of known and suspected bots.

    Note: I am in no way affiliated with Cloudflare :)
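
    Since the question is about Java, here is a minimal sketch of the User-Agent check from step 1, done in application code rather than in the web server configuration. The class name UserAgentBlocklist and the token list are assumptions for illustration only; the real list should come from your own access logs. It assumes Java 11 or later. In a servlet application the header value would typically come from request.getHeader("User-Agent"), and a blocked request could be answered with a 403.

```java
import java.util.List;
import java.util.Locale;

/** Illustrative User-Agent blocklist; the class name and tokens are hypothetical. */
final class UserAgentBlocklist {

    // Example tokens only. Build the real list from your access logs.
    private static final List<String> BLOCKED_TOKENS =
            List.of("libwww-perl", "httpclient", "java/", "curl", "wget");

    private UserAgentBlocklist() {
    }

    /** Returns true when the header is missing, blank, or contains a blocked token. */
    static boolean isBlocked(String userAgent) {
        if (userAgent == null || userAgent.isBlank()) {
            return true; // treat a missing User-Agent as suspicious
        }
        // Compare case-insensitively, since clients vary in capitalization.
        String normalized = userAgent.toLowerCase(Locale.ROOT);
        for (String token : BLOCKED_TOKENS) {
            if (normalized.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isBlocked("libwww-perl/6.05"));
        System.out.println(isBlocked("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"));
    }
}
```

    A check like this only filters out clients that identify themselves honestly. A program that copies a browser's User-Agent string will pass, which is why steps 2 and 3 are still worth doing.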