Tags: javascript, php, html, web-scraping, goutte

Scrape a data-* attribute using Goutte?


How do I scrape a data-* attribute from an <a> link using Goutte and Laravel?

I want to scrape a tag like so:

<a class="ProfileNav-stat ProfileNav-stat--link u-borderUserColor u-textCenter js-tooltip js-nav u-textUserColor" data-nav="following" href="/rogerhamilton/following" data-original-title="987,358 Following">

From this <a> link I then want to read the value of the data-original-title attribute.

My code is:

$client = new Client();

//  Hackery to allow HTTPS
$guzzleclient = new \GuzzleHttp\Client([
    'timeout' => 60,
    'verify' => false,
]);

//  Hand the configured Guzzle client to Goutte
$client->setClient($guzzleclient);
$crawler = $client->request('GET', 'url');


$elements = $crawler->filter('.ProfileNav-stat.ProfileNav-stat--link')->each(function($node){
    $x = $node->filter('data-original-title');
    dd($x);
});

but it doesn't return the correct data.


Solution

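  • For anyone else who comes across this issue: it's as simple as filtering down to the link and then reading the attribute with attr(), e.g. $node->filter('.classname or #ID')->attr('data-original-title'). filter() expects a CSS selector, so passing the bare attribute name 'data-original-title' matches nothing. If $node is already the <a> element, as in the each() callback above, $node->attr('data-original-title') is enough.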
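
To make the fix concrete, here is a minimal sketch that applies attr() to the markup from the question. It is not the original poster's final code: it assumes symfony/dom-crawler and symfony/css-selector are installed through Composer, and it parses a hard-coded HTML string (shortened from the question) instead of fetching a live page.

```php
<?php
use Symfony\Component\DomCrawler\Crawler;

// Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector.
require __DIR__ . '/vendor/autoload.php';

// Sample markup, shortened from the <a> tag shown in the question.
$html = <<<'HTML'
<html><body>
<a class="ProfileNav-stat ProfileNav-stat--link" data-nav="following"
   href="/rogerhamilton/following" data-original-title="987,358 Following">Following</a>
</body></html>
HTML;

$crawler = new Crawler($html);

// filter() takes a CSS selector; attr() reads an attribute of the matched node.
// each() collects the callback's return values into an array.
$titles = $crawler
    ->filter('a.ProfileNav-stat.ProfileNav-stat--link')
    ->each(function (Crawler $node) {
        return $node->attr('data-original-title');
    });

print_r($titles);
```

Goutte's $client->request() returns the same Symfony Crawler type, so the filter()->each() chain should carry over unchanged to the $crawler in the question.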