I am trying to scrape kickasstorrents with Simple HTML DOM, but I am getting an error before I have even started parsing. I followed some Simple HTML DOM tutorials, set up my URL, and am fetching the page with cURL.
Code is as follows:
<?php
require('inc/config.php');
include_once('inc/simple_html_dom.php');
function scrap_kat() {
// initialize curl
$html = 'http://katcr.to/new/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $html);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$ip=rand(0,255).'.'.rand(0,255).'.'.rand(0,255).'.'.rand(0,255);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("REMOTE_ADDR: $ip", "HTTP_X_FORWARDED_FOR: $ip"));
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/".rand(3,5).".".rand(0,3)." (Windows NT ".rand(3,5).".".rand(0,2)."; rv:2.0.1) Gecko/20100101 Firefox/".rand(3,5).".0.1");
$html2 = curl_exec($ch);
if($html2 === false)
{
echo 'Curl error: ' . curl_error($ch);
}
else
{
// create HTML DOM
$kat = file_get_contents($html);
}
curl_close($ch);
// scripting starts
// clean up memory
$kat->clear();
unset($kat);
// return information
return $ret;
}
$ret = scrap_kat();
echo $ret;
?>
I receive the following error:
Fatal error: Call to a member function clear() on resource in C:\wamp64\www\index.php on line 36
What am I doing wrong? Thanks.
simple_html_dom is a class. clear() is a method of that class (or of the simple_html_dom_node class), so you can only call it on an object. To use Simple HTML DOM, you need to create an instance of the simple_html_dom class.
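A small sketch of why the call fails, assuming nothing beyond core PHP: whatever file_get_contents() or curl_exec() (with CURLOPT_RETURNTRANSFER) hands back is a plain string, not an object, so it has no clear() method. The helper name can_clear is made up for illustration and is not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper: true only when $value is an object that has a clear() method.
function can_clear($value) {
    return is_object($value) && method_exists($value, 'clear');
}

// Stand-in for the string that file_get_contents() or curl_exec() would return.
$kat = '<a href="/new/">new</a>';

var_dump(is_string($kat)); // bool(true)
var_dump(can_clear($kat)); // bool(false)
```

Calling $kat->clear() on such a value is what triggers the "Call to a member function clear()" fatal error.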
@Hassaan is correct. file_get_contents is a native PHP function that returns a string, so you have to create an object of the simple_html_dom class, like this:
$html = new simple_html_dom();
Then use the code below.
function scrap_kat() {
$url = 'http://katcr.to/new/';
// $timeout= 120;
# create object
$html = new simple_html_dom();
#### CURL BLOCK ####
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/".rand(3,5).".".rand(0,3)." (Windows NT ".rand(3,5).".".rand(0,2)."; rv:2.0.1) Gecko/20100101 Firefox/".rand(3,5).".0.1");
//curl_setopt($curl, CURLOPT_TIMEOUT, $timeout);
$ip=rand(0,255).'.'.rand(0,255).'.'.rand(0,255).'.'.rand(0,255);
curl_setopt($curl, CURLOPT_HTTPHEADER, array("REMOTE_ADDR: $ip", "HTTP_X_FORWARDED_FOR: $ip"));
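$content = curl_exec($curl);
if ($content === false) {
# the request failed, so there is nothing to parse
echo 'Curl error: ' . curl_error($curl);
curl_close($curl);
return;
}
curl_close($curl);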
# note the variable change.
# load the curl string into the object.
$html->load($content);
//echo $ip;
#### END CURL BLOCK ####
print_r($html->find('a'));
// clean up memory
$html->clear();
unset($html);
}
scrap_kat();
There are a lot of errors in your code, so I am just showing you one way to do this. If you need an explanation, please comment below this answer and I will add one.
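The function above only prints the matched nodes, while the question's version was meant to return something. As a rough sketch of that last step, the snippet below collects each link's href and text into an array. It assumes the library sits at inc/simple_html_dom.php as in the question; extract_links is a made-up name, and the sample markup is a stand-in for the string returned by curl_exec().

```php
<?php
// Path taken from the question; adjust it to wherever the library lives.
include_once('inc/simple_html_dom.php');

// Hypothetical helper: turn an HTML string into a list of links.
// Returns an array of array('href' => ..., 'text' => ...) entries.
function extract_links($content) {
    // str_get_html() ships with simple_html_dom.php and returns false
    // when it cannot build a DOM object (for example, on empty input).
    $html = str_get_html($content);
    if ($html === false) {
        return array();
    }

    $links = array();
    foreach ($html->find('a') as $a) {
        $links[] = array(
            'href' => $a->href,
            'text' => trim($a->plaintext),
        );
    }

    // clean up memory, as in the code above
    $html->clear();
    unset($html);

    return $links;
}

// Stand-in markup instead of a live request.
$sample = '<ul><li><a href="/one/">First</a></li><li><a href="/two/">Second</a></li></ul>';
print_r(extract_links($sample));
```

Inside scrap_kat() the same helper could be called with $content and its result returned, so the caller decides how to display it.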