Tags: php, javascript, security, bots, csrf-protection

Web form bot protection that is transparent to the user


I have spent the last few days thinking about how to protect a web form that bots are abusing. The abuse is heavy: around 800k bot requests in roughly 8 hours.

Here is a quick overview of the situation. If any information is missing, please ask.

The bot:

  1. The bot uses many different IPs.
  2. The bot rotates its user agent among strings used by real browsers.
  3. It is unknown whether the bot runs JS and accepts cookies.

The problems:

  1. The form can't use a hidden token field, because it may be submitted from outside resources: other websites that don't know about CSRF tokens and can't generate them. That makes CSRF tokens impossible to use.
  2. The website MUST be cached in the browser, and the cache may be reset only in exceptional situations, such as suspicious behavior.
  3. The database can't be used intensively(!).

The way it is now:

  1. A cookie counter with an expiration, hashed together with additional characters; only the system knows where they are inserted.
  2. If the browser can't handle cookies, database logging is used. The difficulty here is the browser cache: when the page is served from cache, the user never reaches the server, so verification doesn't run and the counter isn't incremented.
  3. reCAPTCHA is shown to any user who exceeds the attempt limit within X time.
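The cookie counter from item 1 could be sketched roughly as below. This is only a sketch under assumptions, not the scheme described above: it swaps the "hash plus inserted characters" for a standard HMAC over the counter and its expiration, and `SECRET`, `LIMIT`, `WINDOW_MS`, and all function names are hypothetical placeholders.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Hypothetical values: the secret, limit, and window are placeholders.
const SECRET = "replace-with-a-server-side-secret";
const LIMIT = 20;                 // max requests per window
const WINDOW_MS = 10 * 60 * 1000; // the "X time" from item 3

function sign(payload) {
  return createHmac("sha256", SECRET).update(payload).digest("hex");
}

// Cookie value format: "<count>.<expiresAtMs>.<hmac>"
function encodeCounter(count, expiresAt) {
  const payload = `${count}.${expiresAt}`;
  return `${payload}.${sign(payload)}`;
}

// Returns { count, expiresAt }, or null if the value is missing, tampered with, or expired.
function decodeCounter(value, now) {
  if (typeof value !== "string") return null;
  const parts = value.split(".");
  if (parts.length !== 3) return null;
  const [count, expiresAt, mac] = parts;
  const expected = Buffer.from(sign(`${count}.${expiresAt}`));
  const given = Buffer.from(mac);
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (Number(expiresAt) <= now) return null;
  return { count: Number(count), expiresAt: Number(expiresAt) };
}

// One request: returns the new cookie value and whether to show the captcha.
function bump(cookieValue, now) {
  const state = decodeCounter(cookieValue, now) ?? { count: 0, expiresAt: now + WINDOW_MS };
  const count = state.count + 1;
  return { cookie: encodeCounter(count, state.expiresAt), captcha: count > LIMIT };
}
```

A client that simply drops the cookie starts again from zero on every request, which is why item 2 still needs a server-side fallback.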

The ideas that came up:

  1. Serve an iframe with some content and Expires: 0. The iframe runs simple cookie logic.
  2. Iframe logic: if the cookie is not set, set it; if it is set, verify it. If the user hasn't exceeded the limit, increment the counter; if they have, send them to a specific page that shows a warning and resets the cache.
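The two iframe ideas might be combined along these lines. It is a minimal sketch, assuming the cookie has already been verified elsewhere and reduced to a plain number; `LIMIT`, `WARNING_URL`, and `iframeResponse` are hypothetical names. `Cache-Control: no-store` is added next to `Expires: 0` because it is the more explicit way to keep a response out of caches.

```javascript
// Hypothetical values: the limit and the warning URL are placeholders.
const LIMIT = 20;
const WARNING_URL = "/limit-warning";

// Headers that keep the iframe response out of every cache.
const NO_CACHE = {
  "Cache-Control": "no-store, max-age=0",
  "Expires": "0", // invalid dates such as "0" count as already expired
  "Content-Type": "text/html; charset=utf-8",
};

// cookieCount: the verified counter from the cookie, or null if no valid cookie arrived.
// Returns what the iframe endpoint should send back.
function iframeResponse(cookieCount) {
  if (cookieCount === null) {
    // Cookie not set: set it, counting this request as the first.
    return { headers: NO_CACHE, newCount: 1, body: "" };
  }
  if (cookieCount >= LIMIT) {
    // Limit exceeded: the iframe cannot redirect its parent over HTTP,
    // so it asks the top window to navigate (this step needs JS).
    const body = `<script>top.location.replace(${JSON.stringify(WARNING_URL)})</script>`;
    return { headers: NO_CACHE, newCount: cookieCount, body };
  }
  // Under the limit: counter +1.
  return { headers: NO_CACHE, newCount: cookieCount + 1, body: "" };
}
```

The `top.location` step assumes the iframe is same-origin with the page and that the client runs JS, so a client without JS is not redirected by it.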

The difficulty here: what if the bot doesn't support cookies and the content is served from cache? The db doesn't record anything, because the user never reaches the server. However, if the user changes the keyword, the cached copy no longer applies and the logic behind it will run.

The second difficulty: what if the bot doesn't support JS? (It will be thrown out when it switches keyword.) But it can't be redirected while the content is served from cache.

The third difficulty: what if the bot solves reCAPTCHA? :)

The questions:

What would you do in this situation? Please describe the steps you have in mind. I'd really appreciate your point of view. Every idea can be refined with other ideas, and together we can come up with a great protection scheme! Thank you, guys!


Solution

  • So my idea to fight the user's cache was:

    Use a 1x1 iframe.

    Its content is sent with Expires: 0, so the iframe is requested every time, even when the page itself is loaded from cache.

    Another idea I just came up with is to record user input events: onmousemove and onkeydown, which even catch an F5 keydown. Report the event to the server and set a flag.

    FINAL RESULT: It was decided to use cloaked CSS that sets a system variable indicating that the user is loading content normally. As extra protection when the "user" is loading content normally, JavaScript event tracking (onmousemove, onkeydown, onclick) is implemented, and the first event is sent to the server to flag the user. The request is sent to the server only once, when the first event occurs; after that, tracking stops.
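The one-shot event tracking from the final result could look roughly like this. It is a sketch under assumptions: `trackFirstInteraction` and the `/human-flag` endpoint are hypothetical names, and `navigator.sendBeacon` is just one possible way to send the report.

```javascript
const EVENTS = ["mousemove", "keydown", "click"];

// Calls report(eventType) for the first matching event, then stops listening.
function trackFirstInteraction(target, report) {
  function onFirst(event) {
    // Remove every listener first so only one report is ever sent.
    for (const name of EVENTS) target.removeEventListener(name, onFirst);
    report(event.type);
  }
  for (const name of EVENTS) target.addEventListener(name, onFirst);
}

// Browser usage (guarded so the file also loads outside a browser).
if (typeof document !== "undefined") {
  trackFirstInteraction(document, (type) => {
    // "/human-flag" is a placeholder endpoint that would set the server-side flag.
    navigator.sendBeacon("/human-flag", type);
  });
}
```

A bot driving a real browser can generate these events too, so the flag works better as one signal among several than as proof of a human.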