I intend to use PHP Simple HTML DOM to extract the links from this page.
The code I wrote is as follows:
$url = "https://www.technolife.ir/product-3303";
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_REFERER, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
$html_base = new simple_html_dom();
foreach ($html_base->find('a') as $element) {
    echo "<pre>";
    print_r($element->href);
    echo "</pre>";
}
But unfortunately I get this error when I run it:
Call to a member function find() on null
https://www.technolife.ir/product-3303 serves gzip-compressed content even when the client doesn't request compression, so you just get a blob of binary gzip-compressed data, which looks like complete junk to simple_html_dom and causes it to crash.
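If you want to confirm that, gzip streams always start with the magic bytes 0x1F 0x8B, so a quick check on the raw curl_exec() output (a small sketch, assuming $str holds that output) is:

if (substr($str, 0, 2) === "\x1f\x8b") {
    // the body is gzip-compressed; gzdecode() could unpack it manually,
    // but letting libcurl handle it (below) is cleaner
    echo "response is gzip-compressed\n";
}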
Luckily, libcurl has built-in support for decompressing gzip, which can be enabled with curl_setopt($curl, CURLOPT_ENCODING, ''); (the empty string tells libcurl to accept, and automatically decode, every encoding it supports).
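Add it next to the other curl_setopt() calls, before curl_exec() runs:

curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
// enable automatic decompression of gzip/deflate responses
curl_setopt($curl, CURLOPT_ENCODING, '');
$str = curl_exec($curl);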
That said, you should use DOMDocument over simple_html_dom:
$html_base = new DOMDocument();
@$html_base->loadHTML($str);
foreach ($html_base->getElementsByTagName('a') as $element) {
    echo "<pre>";
    print_r($element->getAttribute("href"));
    echo "</pre>";
}
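One side note: the @ in front of loadHTML() just mutes the warnings DOMDocument emits for HTML5 tags and sloppy markup. If you'd rather not use error suppression, libxml can collect those warnings instead (a small sketch of the same parse):

libxml_use_internal_errors(true);  // collect parse warnings instead of printing them
$html_base = new DOMDocument();
$html_base->loadHTML($str);
libxml_clear_errors();             // discard the collected warnings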