
Screen scraping in PHP problem


I built a screen-scraping module that works well, but with certain limitations. Now I want to remove those limitations, and I am running into different, unpredictable errors. Before going further, let me explain what is actually happening. Initially I used screen scraping to retrieve results for a set of keywords (search terms) from Google's regional search engines, such as google.co.in, google.co.uk, google.nl, google.de, and google.com.

But now I have to run the scraping logic for multiple search engines and multiple keywords in a loop.

Let's look at this with an example:

keyword    se            company  rank
telephony  google.co.in  airtel   01
telephony  google.co.in  bsnl     04
telephony  google.co.in  aircel   06
telephony  google.co.in  idea     03
mobile op  google.co.uk  airtel   09
mobile op  google.co.uk  bsnl     04

and so on, for more than 6 keywords, all of the search engines listed above, and all companies.
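The table above can be sketched as a small data structure. This is only an illustration: the function names `buildJobs` and `rankCompanies` are hypothetical, and matching a company by looking for its name inside the result URL is an assumption, not something the original module is known to do. The point it shows is that one fetched result list per (keyword, se) pair can serve every company, so the number of requests does not grow with the number of companies.

```php
<?php
// Hypothetical helper: one job per (keyword, search engine) pair.
// Companies are deliberately NOT part of the job, because a single
// result list can be checked against every company.
function buildJobs(array $keywords, array $engines): array
{
    $jobs = [];
    foreach ($keywords as $keyword) {
        foreach ($engines as $engine) {
            $jobs[] = ['keyword' => $keyword, 'se' => $engine];
        }
    }
    return $jobs;
}

// Hypothetical helper: given result URLs in ranked order, return the
// 1-based rank of the first URL containing each company name
// (assumption: a case-insensitive substring match is good enough).
// A company that does not appear gets null.
function rankCompanies(array $resultUrls, array $companies): array
{
    $ranks = [];
    foreach ($companies as $company) {
        $ranks[$company] = null;
        foreach (array_values($resultUrls) as $position => $url) {
            if (stripos($url, $company) !== false) {
                $ranks[$company] = $position + 1;
                break;
            }
        }
    }
    return $ranks;
}
```

With 6 keywords and 5 search engines, `buildJobs` yields 30 jobs no matter how many companies are being tracked.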

Initially I was retrieving it for one keyword and search engine (se) and all companies, but now I have to build a list covering all keywords, search engines, and companies. I simply used loops to do that, but I ran into these errors:

  1. Memory allocated 343322111 bytes overflowed(... [to get around this I raised the limit using ini_set() with the memory_limit setting]
  2. After some requests Google shows a CAPTCHA. To avoid the CAPTCHA I used sleep() or usleep(), but that does not solve the problem, and in the end I get ERROR: connection reset. I can't use a delay of 30 seconds or more, because retrieving the data would then take hours. My code fetches 5 pages of Google results per search, that is, 50 results. The library I am using is simple_html_dom.php.

It works fine for 1 page, but not for more than 3 pages. What should I do or use?


Solution

  • Using sleep() together with &num=100 in the query solves the problem. Using &num=100 reduces the number of requests to Google tenfold, and between requests I used a 5-second delay, which Google seems to treat as a valid, genuine, human request.
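A minimal sketch of that approach follows. It rests on several assumptions: the helper names `buildSearchUrl` and `fetchAll` are made up for illustration, the URL shape (`https://www.<se>/search?q=...`) is the commonly seen Google form rather than something stated in the post, and the fetcher is passed in as a callable so the example does not depend on any particular library.

```php
<?php
// Hypothetical helper: build a search URL that asks for 100 results
// on one page (num=100) instead of paging through 10 at a time.
function buildSearchUrl(string $engine, string $keyword, int $num = 100): string
{
    $query = http_build_query(['q' => $keyword, 'num' => $num], '', '&');
    return 'https://www.' . $engine . '/search?' . $query;
}

// Hypothetical helper: fetch each URL in turn, pausing between
// requests. $fetch is any callable that takes a URL and returns the
// page (assumption: the caller supplies it). No pause is added after
// the last request.
function fetchAll(array $urls, callable $fetch, int $delaySeconds = 5): array
{
    $pages = [];
    $urls  = array_values($urls);
    $last  = count($urls) - 1;
    foreach ($urls as $i => $url) {
        $pages[$url] = $fetch($url);
        if ($i < $last) {
            sleep($delaySeconds);
        }
    }
    return $pages;
}

// Example usage (not executed here, since it would hit the network):
// $urls  = [buildSearchUrl('google.co.in', 'telephony')];
// $pages = fetchAll($urls, 'file_get_contents', 5);
```

Note that sleep() takes whole seconds while usleep() takes microseconds, so a 5-second pause is sleep(5) or usleep(5000000).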