Tags: php, html, xpath, domdocument

Parse HTML text then loop through select options' text and value


I'm trying to map each option's value to its text (value -> text) for the options in a select element. The problem with the implementation below is that the values are not paired with the right text, starting with the selected one:

$html = '<select class="general class" 
    data-url="/foo/bar/">
    <option value=""></option>
    <option Selected value="Bar 1">Foo 1</option>
    <option  value="Bar 2">Foo 2</option>
    <option  value="Bar 3">Foo 3</option>
    <option  value="Bar 4">Foo 4</option>
    <option  value="Bar 5">Foo 5</option>
    <option  value="Bar 6">Foo 6</option>
    <option  value="Bar 7">Foo 7</option>
</select>';

$dom = new \DomDocument('1.0', 'UTF-8');
libxml_use_internal_errors(true);   
$dom->loadHTML($html);

$xp = new \DOMXpath($dom);
$opts_txt = $xp->query('//select[@data-url="/foo/bar/"]/option/text()');
$opts_vals = $xp->query('//select[@data-url="/foo/bar/"]/option/@value');

foreach ($opts_txt as $key => $opt) {
    echo $opt->nodeValue. "\n";
    echo $opts_vals->item($key)->nodeValue. "\n\n";
}

Output:

Foo 1


Foo 2
Bar 1

Foo 3
Bar 2

Foo 4
Bar 3

Foo 5
Bar 4

Foo 6
Bar 5

Foo 7
Bar 6

I know this must be because the first value is empty, but I would rather keep this clean and not add much logic just to pair them correctly. I imagine there is another, more direct way.

Note: I can't select the element by class, because many selects share the same class and I'm not sure of their position in the HTML.


Solution

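  • Two separate XPath queries are not guaranteed to return the same number of nodes. Here the first option is empty, so it has no text() node: the text query returns 7 nodes while the @value query returns 8, and every pair is shifted by one. It is easier to query the option elements once, then read the attribute and the text content of each element inside the loop.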

    <?php
    
    $html = '<select class="general class" 
        data-url="/foo/bar/">
        <option value=""></option>
        <option Selected value="Bar 1">Foo 1</option>
        <option  value="Bar 2">Foo 2</option>
        <option  value="Bar 3">Foo 3</option>
        <option  value="Bar 4">Foo 4</option>
        <option  value="Bar 5">Foo 5</option>
        <option  value="Bar 6">Foo 6</option>
        <option  value="Bar 7">Foo 7</option>
    </select>';
    
    $dom = new \DomDocument('1.0', 'UTF-8');
    libxml_use_internal_errors(true);   
    $dom->loadHTML($html);
    
    $xp = new \DOMXpath($dom);
    
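    // Query the <option> elements once and keep the result, so that value and text come from the same node.
    $opts = $xp->query('//select[@data-url="/foo/bar/"]/option');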
    foreach ($opts as $opt) {
      var_dump($opt->getAttribute('value'));
      var_dump($opt->textContent);
      echo "\n";
    }
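
If the goal is the value -> text association itself rather than a dump, the same single-query loop can fill an array. The following is only a sketch: the helper name selectOptionsToMap and its parameters are hypothetical, not part of the original answer, and it assumes the options are direct children of the select (no optgroup).

```php
<?php

// Hypothetical helper (name and signature are assumptions for illustration):
// returns an array of value => text for every <option> that is a direct child
// of a <select> whose data-url attribute equals $dataUrl.
function selectOptionsToMap(string $html, string $dataUrl): array
{
    $dom = new DOMDocument('1.0', 'UTF-8');

    // Collect libxml warnings internally instead of emitting them, then
    // restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    $map = [];

    // Compare data-url in PHP rather than concatenating it into the XPath
    // expression, so quotes inside $dataUrl cannot break the query.
    foreach ($xp->query('//select[@data-url]') as $select) {
        if ($select->getAttribute('data-url') !== $dataUrl) {
            continue;
        }
        // Value and text are read from the same node, so they cannot drift apart.
        foreach ($xp->query('./option', $select) as $opt) {
            $map[$opt->getAttribute('value')] = trim($opt->textContent);
        }
    }

    return $map;
}
```

Called as selectOptionsToMap($html, '/foo/bar/') with the HTML from the question, the empty placeholder option ends up under the '' key, where it can be skipped or unset if it is not wanted. Note that PHP converts numeric-looking string keys to integers, which matters only if the option values are numbers.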