Search code examples
phpasp.netcurlscreen-scraping

Screen Scraping of aspx page using curl


I am using this code, and its not working. Please help

$url = "http://www.riogrande.com/Category/Findings-and-Finished-Jewelry/132/Bails-and-Enhancers/472";
$file=file_get_contents($url);
preg_match("#.*?#mis", $file, $arr_viewstate);
$viewstate = urlencode($arr_viewstate[1]);
$eventvalidation = urlencode($arr_viewstate[2]);
$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    CURLOPT_HEADER => false, // don't return headers
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_USERAGENT => "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7'", // who am i
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 1120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
    CURLOPT_POST => true,
    CURLOPT_VERBOSE => true,
    CURLOPT_POSTFIELDS => '__EVENTTARGET='.urlencode('ctl00$ContentPlaceHolderBody$SearchPageNavigationTop$rptPager$ctl01').'&__EVENTARGUMENT='.urlencode('').'&__VIEWSTATE='.$viewstate.'&__EVENTVALIDATION='.$eventvalidation.'&__LASTFOCUS='.urlencode('')
);

$ch = curl_init($url);
curl_setopt_array($ch,$options);

Solution

  • The truth is that i don't understand what you want to achieve but i definitely know that that is not the way to get __VIEWSTATE and __EVENTVALIDATION

    it should be something like this

    $url = "http://www.riogrande.com/Category/Findings-and-Finished-Jewelry/132/Bails-and-Enhancers/472";
    $html = file_get_contents($url);
    
    preg_match('~<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />~',$html,$viewstate);
    preg_match('~<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />~',$html,$eventvalidation);
    
    $viewstate = $viewstate[1];
    $eventvalidation = $eventvalidation[1] ;
    
    var_dump($viewstate,$eventvalidation);