python perl imagemagick screen-scraping mechanize

Virtual screen scraping with Perl


Is it possible to do virtual screen scraping in Perl or Python? Suppose I have a login page where, once I enter the username and password, I am taken to a second authentication page that asks me to type what a CAPTCHA reads. With Mechanize or a similar tool I can automate the first step. For the second step, is it possible to capture a screenshot of the CAPTCHA page through Perl (a virtual one, since the page is not actually loaded in a browser)? Once I have it, perhaps I can automate a CAPTCHA-reading tool (Google has one) that will attempt to read it. All such CAPTCHA pages have the CAPTCHA image in a fixed place within a fixed-size box, so I can use ImageMagick to crop that part of the screenshot and feed it to the Google tool; it will take a few trial-and-error runs to find out which portion of the screenshot contains the CAPTCHA. So is it possible?


Solution

  • You don't need to emulate or capture the screen at all. Find the URL from which the CAPTCHA page requests its image data and download that image yourself; you will then have a ready-made image file.
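
A minimal Python sketch of that idea, using only the standard library, might look like the following. It assumes the CAPTCHA is served as an ordinary <img> tag whose src contains the word "captcha", and that the image is tied to the session cookies set by the page; both are guesses about the target site, so inspect the real page source first. The names ImgSrcCollector, find_captcha_url, and fetch_captcha are hypothetical helpers written for this illustration.

```python
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.parse import urljoin
from urllib.request import HTTPCookieProcessor, build_opener


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_captcha_url(html, page_url, marker="captcha"):
    """Return the absolute URL of the first <img> whose src contains marker.

    The marker is an assumption; check how the real page references
    its CAPTCHA image and adjust it accordingly.
    """
    collector = ImgSrcCollector()
    collector.feed(html)
    for src in collector.sources:
        if marker in src.lower():
            # Resolve relative paths against the page that referenced them.
            return urljoin(page_url, src)
    return None


def fetch_captcha(opener, page_url, out_path="captcha.png"):
    """Fetch the page and its CAPTCHA image through the same opener, so
    cookies set by the page are sent along with the image request."""
    with opener.open(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    image_url = find_captcha_url(html, page_url)
    if image_url is None:
        raise ValueError("no CAPTCHA image found on " + page_url)
    with opener.open(image_url) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
    return out_path


# One opener with a cookie jar keeps the session across both requests.
opener = build_opener(HTTPCookieProcessor(CookieJar()))
```

Calling fetch_captcha(opener, "https://example.com/login/step2") (a placeholder URL) would save the image to captcha.png, with no screenshot or cropping step needed. If the login itself is automated with Mechanize, the image should probably be requested through that same session rather than a fresh one, since many sites generate the CAPTCHA per session.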