I have a site that I need to parse.
First, I have to collect all the catalog URLs on the page. Then I need to visit each of those URLs and collect the URLs on each page again, and finally visit all of those URLs and get the element ('.description div').
I'm using Simple HTML DOM.
But I have a problem when I try to go through the URLs I collected in the first pass: I get an empty page.
    include 'simple_html_dom.php';

    $catalogs = file_get_html('http://optnow.ru/catalog');
    $catalogLink = [];

    if (!empty($catalogs)) {
        foreach ($catalogs->find('div.cat-name a') as $catalog) {
            $catalogUrl = 'http://optnow.ru/' . $catalog->href . '?page=0';
            $catalogLink[] = $catalogUrl;
            $catalogHtml = file_get_html($catalogUrl);
            $productsLink = $catalogHtml->find('.link-pv-name');
            print_r($productsLink->href);
        }
    }
Where is my mistake?
Thanks.
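find() returns an array of elements, not a single element, so you cannot read ->href from its result directly. Loop over the result with foreach (or pass an index, e.g. find('.link-pv-name', 0), to get one element):

    include 'simple_html_dom.php';

    $catalog = file_get_html('http://optnow.ru/catalog');
    $catalogLink = [];

    if (!empty($catalog)) {
        // First pass: collect the URL of every catalog.
        foreach ($catalog->find('div.cat-name a') as $catalogHref) {
            $myLink = 'http://optnow.ru/' . $catalogHref->href . '?page=0';
            $catalogLink[] = $myLink;
            echo '<pre>' . $myLink . '</pre>';
        }

        // Second pass: load each catalog page and print its product links.
        foreach ($catalogLink as $catalogSingleLink) {
            $catalogHtml = file_get_html($catalogSingleLink);
            if (empty($catalogHtml)) {
                continue; // file_get_html() returns false if the page could not be loaded
            }
            // find() returns an array here, so read href inside the loop.
            foreach ($catalogHtml->find('.link-pv-name') as $catalogProduct) {
                echo $catalogProduct->href . '<br>';
            }
        }
    }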
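
For the last step from the question (visiting every product URL and reading '.description div'), the same pattern can be repeated one level deeper. Below is a rough sketch, not a tested scraper for this site. It assumes product links are relative to the site root in the same way the catalog links are. The names collectLinks, crawlDescriptions and $loadPage are hypothetical: $loadPage stands in for file_get_html, so the looping logic does not depend on how pages are fetched.

```php
<?php
// Sketch only. $loadPage is a hypothetical callable that takes a URL and
// returns an object with a Simple HTML DOM style find() method, or false
// when the page cannot be loaded.

// Load $url and return $base . href for every element matching $selector.
function collectLinks(callable $loadPage, string $url, string $selector, string $base): array
{
    $html = $loadPage($url);
    if (empty($html)) {
        return []; // page could not be loaded
    }

    $links = [];
    // find() returns an array of elements, so href is read inside the loop.
    foreach ($html->find($selector) as $element) {
        $links[] = $base . $element->href;
    }
    return $links;
}

// Walk catalog list -> catalog pages -> product pages and collect the text
// of every '.description div' block, keyed by product URL.
function crawlDescriptions(callable $loadPage, string $base): array
{
    $descriptions = [];

    $catalogUrls = collectLinks($loadPage, $base . 'catalog', 'div.cat-name a', $base);
    foreach ($catalogUrls as $catalogUrl) {
        // Assumption: product hrefs are relative to $base, like catalog hrefs.
        $productUrls = collectLinks($loadPage, $catalogUrl . '?page=0', '.link-pv-name', $base);
        foreach ($productUrls as $productUrl) {
            $page = $loadPage($productUrl);
            if (empty($page)) {
                continue; // skip product pages that fail to load
            }
            foreach ($page->find('.description div') as $block) {
                $descriptions[$productUrl][] = trim($block->plaintext);
            }
        }
    }
    return $descriptions;
}
```

With simple_html_dom.php included, this could be called as crawlDescriptions('file_get_html', 'http://optnow.ru/'). Passing the loader in as a parameter also makes it possible to try the loops against a stub before hitting the real site.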