I'm struggling to scrape all of the images from an AliExpress search results page. My script gets all of the alt attributes, but only the first 8 image URLs.
<?php
require 'vendor/autoload.php';
use Goutte\Client;
$url = "https://www.aliexpress.com/af/tie.html?SearchText=tie";
$client = new Client();
$crawler = $client->request('GET', $url);
$output = $crawler->filter('#hs-below-list-items li div div.img.img-border div a img')->each(function ($node) {
    echo '<img src="' . $node->attr('src') . '" alt="' . $node->attr('alt') . '">';
});
var_dump($output);
Could this have something to do with AliExpress lazy-loading the images?
Would I need to use something like a headless browser? If so, can you please point me in the right direction?
Any help would be greatly appreciated.
Thanks, Jake.
You need to filter for the lazy-load data attribute itself (image-src here) and read the image URL from it rather than from src.
$output = $crawler->filter('img.picCore[image-src]')->each(function ($node) {
    echo '<img src="' . $node->attr('image-src') . '" alt="' . $node->attr('alt') . '">';
});
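If some images on the page carry a normal src while others only have image-src, a fallback may help. The following is a minimal sketch using PHP's built-in DOMDocument rather than Goutte, so it runs without Composer; the markup, the example.com URLs, and the extractImages() helper are made up for illustration and are not taken from the real AliExpress page.

```php
<?php
// Hypothetical markup imitating a lazy-loaded listing: the first image has a
// real src, the second only carries its URL in an image-src attribute.
$html = <<<HTML
<ul>
  <li><img class="picCore" src="https://example.com/a.jpg" alt="Tie A"></li>
  <li><img class="picCore" image-src="https://example.com/b.jpg" alt="Tie B"></li>
</ul>
HTML;

// Illustrative helper (not part of Goutte): returns one src/alt pair per image.
function extractImages(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        // Prefer the lazy-load attribute, fall back to the regular src.
        $url = $img->getAttribute('image-src') ?: $img->getAttribute('src');
        if ($url !== '') {
            $images[] = ['src' => $url, 'alt' => $img->getAttribute('alt')];
        }
    }
    return $images;
}

foreach (extractImages($html) as $image) {
    echo '<img src="' . htmlspecialchars($image['src']) . '" alt="' . htmlspecialchars($image['alt']) . '">' . PHP_EOL;
}
```

The same "image-src first, then src" check could be applied inside the Goutte each() callback by testing whether $node->attr('image-src') is empty before falling back to $node->attr('src').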
JH