As I understand from various blogs, services like 2captcha are human-powered image and CAPTCHA recognition services. Their main purpose is to solve your CAPTCHAs quickly and accurately, using human employees who are always online to receive a captcha and solve it on their end.
Now let's take the example of https://www.google.com/recaptcha/api2/demo. Say a captcha was generated; services like 2captcha need the data-sitekey, which is generated for every captcha.
data-sitekey="6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-"
What I don't understand is how a captcha solver replicates or reproduces the captcha on their end using just the data-sitekey. Is there any service provided by Google to replicate it?
How does the human on the other end receive the same captcha, solve it, and send it back?
This is quite late to answer, but it may still help somebody in the future.
I also had this question in mind and started analysing it. I went through several websites, blogs, and research papers and found out how it works internally.
Below is what I understood about the captcha implementation.
- The data-sitekey is associated with the website. Before loading the captcha, Google verifies that this key is coming from the associated domain by checking document.location.hostname.
- Solving the captcha produces the g-recaptcha-response token, which is nothing but the captcha solution based on your browser history, google.com cookies, and other browser data.
- A shared secret key between Google and your website is used to verify this token.

How these captcha solver services work
- The service receives the data-sitekey and website-url from the user.
- It serves a page of its own that uses this data-sitekey.
- It modifies the hosts file by adding an entry for the user-provided website-url and pointing it to 127.0.0.1.
- It then loads the website-url, which now resolves to 127.0.0.1. This way, Google considers the request to be coming from a valid website and generates the reCAPTCHA.
- Once the captcha is solved, the g-recaptcha-token is generated and is valid for ~120 seconds. This token is then given back to the user for further steps.
- The user sets the token in the text-area that has an id of g-recaptcha-response and then submits the page.

References
I have explained how this works in my YouTube video, Selenium automation of a website having google recaptcha.
The source no longer exists on GitHub because I deleted my GitHub account. If I can recover the source code, I will add it to my GitLab repository, NiRRaNjAN RauT · GitLab.
Research paper: I’m not a human: Breaking the Google reCAPTCHA
Based on this knowledge, I have built my own captcha solver service, Fast Captcha Solver, at an affordable price.
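To make the hosts-file and token steps described in this answer a little more concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not the code any solver service actually runs: the function names hosts_entry and injection_script are hypothetical, EXAMPLE_TOKEN is a placeholder rather than a real token, and the sketch only builds the strings involved without editing the hosts file or contacting Google.

```python
import json
from urllib.parse import urlparse

LOOPBACK = "127.0.0.1"


def hosts_entry(website_url: str) -> str:
    """Build the hosts-file line that maps the site's hostname to loopback.

    Hypothetical helper: it mirrors the step where the solver points the
    user-provided website-url at 127.0.0.1.
    """
    hostname = urlparse(website_url).hostname
    if hostname is None:
        raise ValueError(f"no hostname found in URL: {website_url!r}")
    return f"{LOOPBACK} {hostname}"


def injection_script(token: str) -> str:
    """Build JavaScript that writes the token into the response textarea.

    Hypothetical helper: it mirrors the final step where the user places the
    returned token in the element with id g-recaptcha-response.
    """
    # json.dumps produces a quoted, escaped string literal that is valid JavaScript.
    return (
        'document.getElementById("g-recaptcha-response").value = '
        + json.dumps(token)
        + ";"
    )


if __name__ == "__main__":
    # Demo page from the question; the token below is a placeholder.
    print(hosts_entry("https://www.google.com/recaptcha/api2/demo"))
    print(injection_script("EXAMPLE_TOKEN"))
```

With Selenium, the string returned by injection_script could be passed to driver.execute_script before submitting the form. Because the token is described above as valid for only ~120 seconds, that submission would need to happen soon after the token is received.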