javascript, security, web-scraping, chrome-devtools-protocol

How to Detect Web Scrapers Using Chrome DevTools Protocol (CDP) Instead of Selenium or Puppeteer?


My website is being targeted by web scraping bots. The attackers appear to be controlling Chrome browsers directly through the Chrome DevTools Protocol (CDP), rather than through automation frameworks like Selenium or Puppeteer. As a result, traditional browser fingerprinting methods aren't revealing any unusual characteristics or anomalies.

I've Tried:

  1. Implementing standard browser fingerprinting techniques

Challenges:

  1. CDP-controlled browsers mimic regular user browsers closely, making it difficult to detect using conventional methods.
  2. Lack of distinctive fingerprints or behavioral anomalies.

Question: What strategies or techniques can I use to effectively detect and mitigate web scrapers that are controlling Chrome browsers via the Chrome DevTools Protocol (CDP) instead of using automation tools like Selenium or Puppeteer? Are there specific indicators or advanced methods that can help identify such sophisticated scraping attempts?


Solution

  • No general answer to this question holds up long-term. Bot detection is a cat-and-mouse game, and it usually relies on more than one detection vector.

    Fingerprinting, which you already mentioned, is a good place to start; see CreepJS for an example.

    Stronger detection "signals" are usually specific to one automation framework and do not apply to CDP in general. See Brotector for an example.

    Disclaimer: I'm the author of Brotector.
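
To illustrate the point about relying on more than one detection vector, here is a minimal sketch of how several weak signals could be combined into a single decision. It is not taken from CreepJS or Brotector. The signal names (`fingerprintAnomaly`, `behaviorAnomaly`, `frameworkSignal`), the weights, and the threshold are all hypothetical placeholders that you would need to replace with your own measurements.

```javascript
// Hypothetical signals and weights, chosen only for illustration.
// Integer weights avoid floating-point rounding surprises when summing.
const WEIGHTS = {
  fingerprintAnomaly: 30, // e.g. a fingerprint check flagged an inconsistency
  behaviorAnomaly: 30,    // e.g. no pointer movement before the first click
  frameworkSignal: 60,    // e.g. a framework-specific check fired
};

// Sum the weights of every signal that is set to true, capped at 100.
function scoreVisitor(signals, weights = WEIGHTS) {
  let score = 0;
  for (const [name, weight] of Object.entries(weights)) {
    if (signals[name] === true) {
      score += weight;
    }
  }
  return Math.min(score, 100);
}

// Map a score to an action; the threshold is an assumption to tune.
function classify(score, threshold = 50) {
  return score >= threshold ? "challenge" : "allow";
}

// Usage: one weak signal alone is not enough, two together are.
const visitor = { fingerprintAnomaly: true, behaviorAnomaly: true };
console.log(classify(scoreVisitor(visitor)));
```

With this layout, no single weak signal blocks a visitor on its own, which limits false positives when any one check stops working after a browser or tooling update.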