python security neural-network conv-neural-network recurrent-neural-network

Captcha Security and Deep Learning

I came across this research paper: http://www.cs.sjsu.edu/~pollett/papers/neural_net_plain.pdf

These researchers have come up with a way to break character-based CAPTCHAs, and it seems they have succeeded: they trained their CNN on 13 million CAPTCHAs and got accuracies higher than 95%.

How can we make a CAPTCHA secure so that it isn't bypassed by a deep learning model?

Solution

First of all, CAPTCHAs are meant to stop automated users (bots). And yes, if you have the actual CAPTCHA generator and you train a deep learning model on samples from that distribution, chances are it will perform well.
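To illustrate why access to the generator matters, here is a minimal sketch of a toy generator that yields an endless stream of labeled samples. Everything in it is hypothetical and not from the paper: the 3x5 digit font `FONT`, the `render` and `labeled_samples` helpers, and the pixel-flip noise are stand-ins for a real generator that would draw distorted glyphs into an image.

```python
import random
from itertools import islice

# Hypothetical 3x5 bitmap font for digits. A real CAPTCHA generator would
# render distorted glyphs into a full image instead.
FONT = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}


def render(text, rng, noise=0.05):
    """Render text as a 5-row grid of 0/1 pixels, flipping each pixel with probability `noise`."""
    rows = []
    for r in range(5):
        row = []
        for ch in text:
            row.extend(int(bit) for bit in FONT[ch][r])
            row.append(0)  # one blank column after each glyph
        rows.append([bit ^ (rng.random() < noise) for bit in row])
    return rows


def labeled_samples(length=4, seed=0):
    """Yield (image, label) pairs forever; the label comes for free with the image."""
    rng = random.Random(seed)
    while True:
        label = "".join(rng.choice("0123456789") for _ in range(length))
        yield render(label, rng), label


# Whoever holds the generator can draw as many labeled examples as they like.
batch = list(islice(labeled_samples(), 1000))
```

The point is only that the generator hands out ground-truth labels at no cost, so the size of the training set is limited by compute rather than by manual labeling.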

CAPTCHAs are getting harder, and they can be made harder still. But generating them takes actual computational resources (unless they are random images rather than synthetic ones). If a really bot-proof website is needed, it can be built.

A bot usually means a web scraping tool or automated user that tries to do what human users do, only much faster. If you also integrate a deep learning model into it, bypassing CAPTCHAs becomes possible in most cases, but that may be overkill depending on your needs. Relatively speaking, saving websites from bots is less important than applications such as facial recognition or self-driving cars.
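Since bots act much faster than humans, request rate is one signal a site can use alongside a CAPTCHA. The answer itself does not describe rate limiting, so treat the following as an assumption: a minimal sliding-window sketch in which the class name `SlidingWindowCounter`, its thresholds, and the client ids are all hypothetical.

```python
from collections import deque


class SlidingWindowCounter:
    """Flag a client that makes more than `max_requests` within `window_seconds`."""

    def __init__(self, max_requests=5, window_seconds=1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = {}  # client id -> deque of recent request times

    def looks_automated(self, client_id, now):
        """Record a request at time `now` (in seconds) and report whether the client exceeds the limit."""
        recent = self._timestamps.setdefault(client_id, deque())
        recent.append(now)
        # Drop requests that have fallen out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_requests


counter = SlidingWindowCounter(max_requests=3, window_seconds=1.0)

# A human-paced client: one request every two seconds.
human_flags = [counter.looks_automated("human", t) for t in (0.0, 2.0, 4.0)]

# A bot-paced client: four requests within 0.3 seconds.
bot_flags = [counter.looks_automated("bot", t) for t in (0.0, 0.1, 0.2, 0.3)]
```

A check like this does not stop a bot that deliberately slows down; it only raises the cost of automating at scale, which fits the point that bot-proofing is a matter of how many resources each side is willing to spend.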