domxpath - Extract li tags from second ul


I am trying to extract only the li tags of the second ul from the following markup. Unfortunately, there are no classes or ids in the HTML to help.

<ul>
    <li>Some text</li>
    <li>Some text</li>
    <li>Some text</li>
</ul>

<ul>
    <li>Some more text</li>
    <li>Some more text</li>
    <li>Some more text</li>
</ul>

I have tried (a few things, actually):

    $ul = $xpath->query('//ul')->item(1);
    $query = '/li';
    $lis = $xpath->evaluate($query, $ul);

I thought this would get me the second ul, and that I could then extract its li elements from there. It does get me the second ul's HTML, but I'm obviously misunderstanding something about `->evaluate()`, because the result contains all the li's, not just those from the second ul.


Solution

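  • You can select them directly with XPath. Here `//ul[2]` matches a ul that is the second ul child of its parent, which fits this markup because both lists are siblings: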

    $xpath->query('//ul[2]/li');
    

    Example:

    $html = <<<EOF
    <ul>
        <li>Some text</li>
        <li>Some text</li>
        <li>Some text</li>
    </ul>
    
    <ul>
        <li>Some more text</li>
        <li>Some more text</li>
        <li>Some more text</li>
    </ul>
    EOF;
    
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    
    $selector = new DOMXPath($doc);
    
    // Iterate over the li elements of the second ul and print their text
    foreach ($selector->query('//ul[2]/li') as $li) {
        echo $li->nodeValue . PHP_EOL;
    }
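As for the two-step attempt in the question: in XPath, a path that starts with `/` or `//` is evaluated from the document root, so the context node passed as the second argument no longer restricts the result. A relative path such as `li` (or `./li`) is evaluated against that node instead. The following is a minimal sketch of that variant, assuming the same markup as above; the `$texts` array is only an illustrative helper for collecting the results.

```php
<?php
// Same markup as in the question (no classes or ids to hook into).
$html = <<<EOF
<ul>
    <li>Some text</li>
    <li>Some text</li>
    <li>Some text</li>
</ul>

<ul>
    <li>Some more text</li>
    <li>Some more text</li>
    <li>Some more text</li>
</ul>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Second ul in document order (item() is zero-based).
$ul = $xpath->query('//ul')->item(1);

// 'li' is relative to the context node given as the second argument.
// A leading '/' or '//' would start from the document root again.
$lis = $xpath->query('li', $ul);

// Illustrative helper: collect the text of each matched li.
$texts = [];
foreach ($lis as $li) {
    $texts[] = trim($li->nodeValue);
}

echo implode(PHP_EOL, $texts) . PHP_EOL;
```

This keeps the lookup of the ul and the lookup of its children as separate steps, which can be handy if the ul node is needed for something else as well.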
    
