Symfony's DomCrawler does not find a specific tag

I'm using DomCrawler to get data from a Google Play page, and it works in 99% of cases. However, I stumbled upon a page where it cannot find a specific div. I checked the HTML and the div is definitely there. My code is:

$autoloader = require __DIR__.'/vendor/autoload.php';
use Symfony\Component\DomCrawler\Crawler;

$app_id = 'com.balintinfotech.sinhalesekeyboardfree';

$response = file_get_contents(''.$app_id);
$crawler = new Crawler($response);
echo $crawler->filter('div[itemprop="datePublished"]')->text();

When I run it against that specific page, I get:

PHP Fatal error: Uncaught InvalidArgumentException: The current node list is empty.

However, if I use any other ID, I get the desired result. What exactly is it about that page that breaks DomCrawler?


  • As you correctly figured out, this doesn't happen in the English version, but it does in the Spanish one.

    One difference I could spot was a comment by a user saying නියමයි ඈ. Something there seems to be bothering the Crawler. If you replace every null character (\x00) with an empty string, it correctly gets what you're looking for:

    $app_id = 'com.balintinfotech.sinhalesekeyboardfree';
    $response = file_get_contents(''.$app_id);
    $response = str_replace("\x00", "", $response);
    $crawler = new Symfony\Component\DomCrawler\Crawler($response);
    var_dump($crawler->filter('div[itemprop="datePublished"]')->text()); // string(14) "March 14, 2017"
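
    The clean-up step above can be sketched as a small reusable helper. This is only an illustration: `stripNullBytes` is a hypothetical name, not part of DomCrawler, and the HTML string is a stand-in for the downloaded page rather than the real response.

    ```php
    <?php
    // Hypothetical helper (not a DomCrawler API): remove NUL bytes before parsing.
    function stripNullBytes(string $html): string
    {
        return str_replace("\x00", '', $html);
    }

    // Stand-in for the downloaded page; the real response would come from file_get_contents().
    $response = "<div itemprop=\"datePublished\">March 14, 2017</div><p>comment\x00text</p>";

    // Count NUL bytes before and after cleaning to confirm whether they were present at all.
    $before = substr_count($response, "\x00");
    $clean  = stripNullBytes($response);
    $after  = substr_count($clean, "\x00");

    printf("NUL bytes before: %d, after: %d\n", $before, $after);
    ```

    The cleaned string can then be passed to `new Crawler($clean)`. Checking `$crawler->filter(...)->count()` before calling `text()` would also avoid the `InvalidArgumentException` on pages where the node really is missing.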

    I'll try to look more into this.