Why can't I search for tags with Dom\HTMLDocument?


I'm trying to use Dom\HTMLDocument, which is new in PHP 8.4.

Let's say I just need to count all divs:

<?php

$html = <<<HTML
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Example</title>
</head>
<body>
    <div>Hello</div>
</body>
</html>
HTML;

$doc = Dom\HTMLDocument::createFromString($html);
$xpath = new Dom\XPath($doc);

// No divs found:
$divs = $xpath->query('//div');
echo $divs->count(); // 0

// 6 elements found, including the div:
$anyTags = $xpath->query('//*');
echo $anyTags->count(); // 6

As you can see, when I use * to grab any element, it works as expected and even the div is found.

Why can't I select by tag name? I tried some fancier expressions with class names etc., and they work properly as long as I use * instead of a specific tag.


Solution

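  • By default, Dom\HTMLDocument::createFromString creates all elements in the namespace http://www.w3.org/1999/xhtml. In XPath 1.0, an unprefixed name such as div only matches elements that are in no namespace, which is why //div finds nothing while //* matches every element. To query the namespaced document via XPath, register a prefix for the namespace and use it in the query, like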

    $xpath->registerNamespace('xhtml', 'http://www.w3.org/1999/xhtml');
    $divs = $xpath->query('//xhtml:div');
    

    If you want un-namespaced HTML (which is what most use cases need), pass Dom\HTML_NO_DEFAULT_NS to Dom\HTMLDocument::createFromString and create the XPath object from the resulting document:

    $doc = Dom\HTMLDocument::createFromString(
        $html, Dom\HTML_NO_DEFAULT_NS);
    // The XPath object must be created from the newly parsed document
    $xpath = new Dom\XPath($doc);
    $divs = $xpath->query('//div'); // returns 1 div
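
For comparison, here is a minimal, self-contained sketch that runs both approaches side by side on the same markup. It assumes PHP 8.4 or later. The sample HTML and the variable names ($namespacedCount, $plainDoc, $plainXpath, $plainCount) are made up for illustration; the xhtml prefix is arbitrary and only has to match the one used in the query.

```php
<?php
// Illustrative input: a small document containing two divs.
$html = '<!DOCTYPE html><html><head><title>Example</title></head>'
      . '<body><div>Hello</div><div>World</div></body></html>';

// Approach 1: keep the default XHTML namespace and query with a registered prefix.
$doc = Dom\HTMLDocument::createFromString($html);
$xpath = new Dom\XPath($doc);
$xpath->registerNamespace('xhtml', 'http://www.w3.org/1999/xhtml');
$namespacedCount = $xpath->query('//xhtml:div')->count();

// Approach 2: parse without the default namespace so that plain names match.
$plainDoc = Dom\HTMLDocument::createFromString($html, Dom\HTML_NO_DEFAULT_NS);
$plainXpath = new Dom\XPath($plainDoc); // bound to the re-parsed document
$plainCount = $plainXpath->query('//div')->count();

// Both approaches should report the same number of divs.
echo $namespacedCount, ' ', $plainCount, PHP_EOL;
```

Either way, the XPath object has to be built from the document you actually want to query; reusing one created for an earlier parse will keep querying the old document.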