php arrays parsing foreach simple-html-dom

Parse URLs, loop over them with file_get_html(), and then get an element


I have a site that I need to parse.

First, I have to collect all the catalog URLs on the page. Then I need to visit each of those URLs and collect the URLs on each page, and finally visit every one of those URLs and get an element ('.description div').

I'm using Simple HTML DOM.

But I have a problem at the point where I loop through the URLs collected in the first pass: I get an empty page.

include 'simple_html_dom.php';
$catalogs = file_get_html('http://optnow.ru/catalog');
$catalogLink = [];
if(!empty($catalogs)) {
    foreach( $catalogs->find('div.cat-name a') as $catalog) {
         $catalogUrl = 'http://optnow.ru/' . $catalog->href . '?page=0';
         $catalogLink[] = $catalogUrl;
         $catalogHtml = file_get_html($catalogUrl);
         $productsLink = $catalogHtml->find('.link-pv-name');
         print_r($productsLink->href);
    }
}

Where is my mistake?

Thanks.


Solution

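  • find() returns an array of elements, not a single element, so you need to loop over its result with foreach instead of reading ->href from it directly:

    include 'simple_html_dom.php';
    $catalog = file_get_html('http://optnow.ru/catalog');
    $catalogLink = [];
    if(!empty($catalog)) {
        // First pass: collect the URL of every catalog page.
        foreach( $catalog->find('div.cat-name a') as $catalogHref) {
             $myLink = 'http://optnow.ru/' . $catalogHref->href . '?page=0';
             $catalogLink[] = $myLink;
             echo '<pre>';
             print_r($myLink);
             echo '</pre>';
        }
        // Second pass: load each catalog page and print its product links.
        foreach ($catalogLink as $catalogSingleLink ) {
             $catalogHtml = file_get_html($catalogSingleLink);
             if(empty($catalogHtml)) {
                 continue; // file_get_html() returns false when the page cannot be loaded
             }
             // find() returns an array, so loop over it to read each href.
             foreach ($catalogHtml->find('.link-pv-name') as $catalogProduct) {
                 echo $catalogProduct->href . '<br>';
             }
        }
    }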
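
The point about looping over a list of matches can be tried without network access. The following is a minimal sketch, not the code from the answer: it uses PHP's bundled DOM extension (DOMDocument and DOMXPath) in place of Simple HTML DOM so that it runs without extra files. The helper name hrefsByClass and the sample markup are assumptions made up for illustration.

```php
<?php
// Sketch using PHP's bundled DOM extension instead of Simple HTML DOM,
// so it runs without extra files or network access.

/**
 * Return the href of every <a> element carrying the given CSS class.
 * (hrefsByClass is a hypothetical helper, not part of any library.)
 */
function hrefsByClass(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Suppress warnings about imperfect real-world markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    // Common XPath idiom for "the class attribute contains this token".
    $query = sprintf(
        "//a[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $hrefs = [];
    // query() returns a list of nodes, so each node is read inside a loop.
    foreach ($xpath->query($query) as $node) {
        $hrefs[] = $node->getAttribute('href');
    }
    return $hrefs;
}

// Made-up sample markup standing in for one catalog page.
$sampleHtml = '<html><body>'
    . '<a class="link-pv-name" href="/product/1">One</a>'
    . '<a class="other" href="/about">About</a>'
    . '<a class="big link-pv-name" href="/product/2">Two</a>'
    . '</body></html>';

$links = hrefsByClass($sampleHtml, 'link-pv-name');
print_r($links);
```

As with find() in Simple HTML DOM, the query result here is a collection, so the href values only become available once each match is visited in turn.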