I am scraping a website for specific links, which I save to my $url_results array. However, I want to skip a link when its li element (class list-items__item) contains a nested span with the class list-items__item__notice (li > a > div > span).
Cluster I am scraping:
<li class="list-items__item">
    <a href="" data-lpurl=""> <!-- The href I am scraping -->
        <span class="list-items__item__position"></span>
        <div class="list-items__item__title">
            <span class="list-items__item__notice"> <!-- I don't want to add the link to my array if this span is present -->
            </span>
        </div>
    </a>
</li>
My PHP scraping function:
$items = $html->find('li[class=list-items__item]');
foreach ($items as $post) {
    $url_results[] = $url . $post->children(0)->href;
}
I am using Simple HTML DOM and cURL to scrape.
I solved the problem by adding an if statement that checks whether the notice span is empty. If it is, the href is added to my array; otherwise nothing happens:
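foreach ($items as $post) {
    // Path: li > a (child 0) > div (child 1) > first child of the div.
    // Add the link only when that node is missing or has no text.
    if (empty($post->children(0)->children(1)->children(0)->plaintext)) {
        $url_results[] = $url . $post->children(0)->href;
    }
}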
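The check above depends on the exact child positions and on the span containing text. A possibly more robust variant is to look the span up by its class inside each li, using Simple HTML DOM's find() with an index, which returns null when nothing matches. This is only a sketch: the sample markup, the example.com base URL, and the include path of simple_html_dom.php are assumptions made for illustration.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical sample markup mirroring the cluster in the question.
$markup = <<<HTML
<ul>
  <li class="list-items__item">
    <a href="/keep-me">
      <span class="list-items__item__position"></span>
      <div class="list-items__item__title">Title</div>
    </a>
  </li>
  <li class="list-items__item">
    <a href="/skip-me">
      <span class="list-items__item__position"></span>
      <div class="list-items__item__title">
        <span class="list-items__item__notice">Notice</span>
      </div>
    </a>
  </li>
</ul>
HTML;

$url = 'https://example.com'; // placeholder base URL, not from the question
$html = str_get_html($markup);
$url_results = [];

foreach ($html->find('li[class=list-items__item]') as $post) {
    // find() with an index returns the matching node, or null when there is none.
    if ($post->find('span.list-items__item__notice', 0) !== null) {
        continue; // a notice span exists somewhere inside this li: skip it
    }

    $link = $post->find('a', 0);
    if ($link !== null) {
        $url_results[] = $url . $link->href;
    }
}

print_r($url_results);
```

With this sample markup, only the /keep-me link should end up in $url_results. Unlike the children() chain, this version skips an li whenever the notice span is present, even if the span is empty or the surrounding markup gains extra wrapper elements.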