Tags: node.js, puppeteer, recaptcha, chromium, invisible-recaptcha

puppeteer - identify when reCaptcha challenge becomes active/visible


Using this reCaptcha demo page: https://recaptcha-demo.appspot.com/recaptcha-v2-invisible.php

Using puppeteer, my goal is to wait for and identify when the challenge (having to pick specific images from a grid) becomes visible on the page. I am NOT asking how to bypass or solve the reCaptcha, just to know when it is active and ready to be solved.

Via DevTools, I have found the HTML elements representing the visible reCaptcha challenge: [image: recaptcha element tree]

Unfortunately, I have been unable to get puppeteer to "find" the specific elements. The following code ALWAYS prints "imageselect NOT found", even when the reCaptcha is clearly visible in the browser and the #rc-imageselect element is present in the element tree. I have experimented with the main frame, child frames, etc., and have been unable to get puppeteer to find the reCaptcha elements.

// Also tried the selectors ".rc-imageselect-payload" and "#rc-imageselect"
let recap = await myframe.$("body #rc-imageselect")
if (recap == null) {
  console.log("imageselect NOT found")
} else {
  console.log("imageselect found")
}

Why is this necessary? On real-world pages (not this demo page), the reCaptcha challenge is triggered for only some users. My goal is to identify when it pops up, then choose how to handle the reCaptcha: solving it (separately, by hand), backing off, or abandoning entirely.

Any help with puppeteer code to find the reCaptcha elements would be greatly welcomed. Thank you.


Solution

  • Some errors never reach the console and so go unnoticed. They are caused by security restrictions on iframes (always a critical point with Chrome/Chromium). Because of the same-origin policy, you are not allowed to go inside the iframe by default, so you will need to launch puppeteer with the following security-disabling args:

    const browser = await puppeteer.launch({
        headless: true, args: ['--disable-web-security', '--disable-features=IsolateOrigins,site-per-process']
    })
    

    These args will always be required in reCaptcha scenarios.
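
    Once the browser is launched, the challenge elements still have to be queried inside the reCaptcha child frame rather than the main frame. Below is a minimal sketch of one way to do that, not a definitive implementation. The helper names `findChallengeFrame` and `waitForChallenge` are hypothetical, and the code assumes the challenge iframe URL contains "/recaptcha/" and "/bframe", which you should confirm in DevTools for your page. `page` is a puppeteer Page created elsewhere.

```javascript
// Hypothetical helper: return the frame that hosts the image challenge, or null.
// Assumption: the challenge iframe URL contains "/recaptcha/" and "/bframe".
function findChallengeFrame(page) {
  const match = page.frames().find((frame) => {
    const url = frame.url();
    return url.includes('/recaptcha/') && url.includes('/bframe');
  });
  return match || null;
}

// Hypothetical helper: poll until #rc-imageselect exists inside the challenge
// frame, or give up once timeoutMs has elapsed. Resolves to the frame or null.
async function waitForChallenge(page, { timeoutMs = 30000, pollMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const frame = findChallengeFrame(page);
    // Query inside the child frame; treat a detached frame as "not found".
    const grid = frame
      ? await frame.$('#rc-imageselect').catch(() => null)
      : null;
    if (grid) return frame;
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return null;
}
```

    After `page.goto(...)` and whatever action triggers the captcha, `await waitForChallenge(page)` resolves to the challenge frame if the image grid appears, or to null if it never does, which is the point where you can decide to solve by hand, back off, or abandon. Note that this only checks that the element exists in the frame. A stricter visibility check may be needed on some pages.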