perl, data-mining, google-search-api

Parsing Google's search results


I'm "working" on a data mining project and I've chosen to parse Google search results. Before I actually start, I'd like to consult you, the experienced folks. I did a bit of research on how Google delivers results and analyzed the structure of a result page. That part is fine: I've already worked out the regexes and data structures I'll use.

Along the way I ran into their CAPTCHA because I was searching too fast; oh, the irony. I've also discovered that they limit results to 1000. Is there any way to avoid these obstacles? For the first one, I could slow the rate of URL fetching, or detect the CAPTCHA and have the script wait for my input; that might do it. But what about the second one? Does Google provide some kind of API that I could use as a workaround? I couldn't find one on their code.* page.


Solution

  • Always look on CPAN first!

    https://metacpan.org/pod/REST::Google

    If someone hasn't already solved your problem, chances are it's a weird one :-)
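
If you do end up fetching pages yourself, the throttling and "wait for my input on CAPTCHA" idea from the question could look roughly like the sketch below. It is written in Python only to keep it short and testable; the same loop maps directly onto a Perl script using sleep and STDIN. Everything here is an assumption made for illustration: the names throttled_fetch, looks_like_captcha and wait_for_user are hypothetical, the fetcher is passed in by the caller, the 5-second delay is arbitrary, and the CAPTCHA check is a crude substring test that may not match what the real block page contains.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def looks_like_captcha(body: str) -> bool:
    # Assumption: a crude marker check; the real block page may look different.
    return "captcha" in body.lower()


def wait_for_user(url: str) -> None:
    # Block until a human has dealt with the CAPTCHA and pressed Enter.
    input(f"CAPTCHA while fetching {url}; solve it, then press Enter: ")


def throttled_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay: float = 5.0,
    on_captcha: Callable[[str], None] = wait_for_user,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, body) pairs, pausing `delay` seconds after each URL.

    `fetch` is whatever function downloads a page and returns its text.
    When a response looks like a CAPTCHA page, `on_captcha` is called and
    the same URL is fetched again, up to `max_retries` extra attempts.
    """
    for url in urls:
        for _attempt in range(max_retries + 1):
            body = fetch(url)
            if not looks_like_captcha(body):
                yield url, body
                break
            on_captcha(url)
        else:
            # Every attempt returned a CAPTCHA page: give up on this run.
            raise RuntimeError(f"still blocked after {max_retries} retries: {url}")
        sleep(delay)
```

Passing the fetcher, the pause handler and the sleep function in as parameters keeps the loop independent of any particular HTTP library and lets it be exercised without touching the network. Note that this only addresses the rate problem; it does nothing about the 1000-result cap.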