
Extracting a specific row of a table with DOMDocument


How can I extract information from an HTML file using DOMDocument in PHP?

My HTML page's source contains the markup below.

This is the third table on the page, and it is the one I need to work with:

    <table>
      <tbody>
        <tr>
          <td>A</td>
          <td>B</td>
          <td>C</td>
          <td>D</td>
        </tr>
        <tr>
          <td>1</td>
          <td>2</td>
          <td>3</td>
          <td>4</td>
        </tr>
      </tbody>
    </table>

If my user asks me to show the row containing B and D, how should I extract the first row of this table and print it using DOMDocument?


Solution

  • This would do it. It grabs the third table, loops over its rows, and checks for B and D in the second and fourth columns. If both are found, it prints each column value and then stops looping.

    $dom = new DOMDocument();
    $dom->loadHTML(.....);
    
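    // get the third table (item() is zero-based, so index 2)
    $thirdTable = $dom->getElementsByTagName('table')->item(2);
    if($thirdTable === null)
    {
        // item() returns null when the page has fewer than three tables
        exit('Fewer than three tables found.');
    }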
    
    // iterate over each row in the table
    foreach($thirdTable->getElementsByTagName('tr') as $tr)
    {
        $tds = $tr->getElementsByTagName('td'); // get the columns in this row
        if($tds->length >= 4)
        {
            // check if B and D are found in column 2 and 4
            if(trim($tds->item(1)->nodeValue) == 'B' && trim($tds->item(3)->nodeValue) == 'D')
            {
                // found B and D in the second and fourth columns
                // echo out each column value
                echo $tds->item(0)->nodeValue; // A
                echo $tds->item(1)->nodeValue; // B
                echo $tds->item(2)->nodeValue; // C
                echo $tds->item(3)->nodeValue; // D
                break; // don't check any further rows
            }
        }
    }
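
For anyone who wants to try this without a real page, here is a minimal self-contained sketch of the same idea. The sample markup (two filler tables in front of the table from the question) and the helper name `findRowContaining` are assumptions made up for illustration, not part of the original answer. Unlike the snippet above, this variant matches the wanted values in any column rather than only in the second and fourth.

```php
<?php
// Sample markup standing in for the real page (an assumption for this sketch):
// two filler tables followed by the table from the question.
$html = <<<HTML
<html><body>
<table><tr><td>x</td></tr></table>
<table><tr><td>y</td></tr></table>
<table>
 <tbody>
  <tr><td>A</td><td>B</td><td>C</td><td>D</td></tr>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td></tr>
 </tbody>
</table>
</body></html>
HTML;

/**
 * Hypothetical helper: return the trimmed cell texts of the first row in the
 * table at $tableIndex (zero-based) whose cells include every value in
 * $wanted, or null if the table or a matching row does not exist.
 */
function findRowContaining(DOMDocument $dom, int $tableIndex, array $wanted): ?array
{
    $table = $dom->getElementsByTagName('table')->item($tableIndex);
    if ($table === null) {
        return null; // the page has fewer tables than expected
    }

    foreach ($table->getElementsByTagName('tr') as $tr) {
        // collect the text of every cell in this row
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        // array_diff is empty only when every wanted value is in the row
        if (count(array_diff($wanted, $cells)) === 0) {
            return $cells;
        }
    }

    return null; // no row matched
}

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$row = findRowContaining($dom, 2, ['B', 'D']);
if ($row !== null) {
    echo implode(' ', $row), PHP_EOL; // prints "A B C D"
}
```

Returning the cells as an array instead of echoing them inside the loop leaves the formatting to the caller, and the `null` return covers both a missing table and a missing row.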