php html web-scraping simple-html-dom

PHP Simple HTML DOM - How to find the table by a particular value in TD?


My table looks like this:

<table width="100%" border="0" cellpadding="2" cellspacing="0">
<tr>
<td><strong>NPA/Area Code:</strong></td>
<td><a href="/area-code/area-code-229.asp">229</a></td>
<td><strong>NXX Use Type:</strong></td>
<td>LANDLINE</td>
</tr>
<tr>
<td><strong>NXX/Prefix:</strong></td>
<td>428</td>
<td><strong>NXX Intro Version:</strong></td>
<td>2000-10-31</td>
</tr>
</table>

The page has many tables with no id or class, so it is hard to find the one I want. I am thinking of using the text in a td to select the table. Is that possible? I have no control over the markup, because this is how the site I want to scrape is coded. I know how to extract the value inside a td, so the question is only how to select this particular table with Simple HTML DOM. The page I want to scrape data from is: scrape source

Any help is appreciated. Thanks.


Solution

  • I suggest you use a marker for that table. Since you're trying to get the table below the heading "AreaCode/Prefix 229-428 Details", find that heading first and then move to its next sibling, which is the table you want. Example:

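    // Load the library; adjust the path to wherever simple_html_dom.php lives.
    include 'simple_html_dom.php';

    $html = file_get_html('http://www.area-codes.com/exchange/exchange.asp?npa=229&nxx=428');
    $table = null;
    $needle = 'AreaCode/Prefix 229-428 Details';
    // Look for the heading that sits directly above the wanted table.
    foreach($html->find('h3') as $marker) {
        if(trim($marker->innertext) === $needle) {
            // The table is the element right after the heading.
            $table = $marker->next_sibling();
            break;
        }
    }
    
    // Collect the cell contents row by row.
    $data = array();
    if($table) {
        foreach($table->children() as $k => $tr) {
            foreach($tr->children() as $td) {
                $data[$k][] = $td->innertext;
            }
        }
    }
    
    echo '<pre>';
    print_r($data);
    echo '</pre>';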
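
If there is no heading to use as a marker, the idea from the question (selecting the table by the text of one of its cells) can also be sketched directly. This is only a minimal sketch: the helper name findTableByCellText, the include path, and the extra "unrelated" table in the sample markup are assumptions made for illustration, and the markup is an inline copy of the table from the question rather than the live page.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require 'simple_html_dom.php';

// Sample markup: a made-up unrelated table, followed by the table from the question.
$markup = <<<'HTML'
<table><tr><td>Some other table</td></tr></table>
<table width="100%" border="0" cellpadding="2" cellspacing="0">
<tr>
<td><strong>NPA/Area Code:</strong></td>
<td><a href="/area-code/area-code-229.asp">229</a></td>
<td><strong>NXX Use Type:</strong></td>
<td>LANDLINE</td>
</tr>
<tr>
<td><strong>NXX/Prefix:</strong></td>
<td>428</td>
<td><strong>NXX Intro Version:</strong></td>
<td>2000-10-31</td>
</tr>
</table>
HTML;

// Hypothetical helper: return the first table that has a cell containing $needle.
function findTableByCellText($html, $needle) {
    foreach ($html->find('table') as $table) {
        // Skip tables that wrap other tables (layout tables),
        // otherwise an outer table would match before the inner one.
        if ($table->find('table')) {
            continue;
        }
        foreach ($table->find('td') as $td) {
            // plaintext drops the inner <strong> and <a> tags.
            if (strpos($td->plaintext, $needle) !== false) {
                return $table;
            }
        }
    }
    return null;
}

$html = str_get_html($markup);
$table = findTableByCellText($html, 'NPA/Area Code:');

// Cells alternate between label and value, so read them in pairs.
$data = array();
if ($table) {
    foreach ($table->find('tr') as $tr) {
        $cells = $tr->find('td');
        for ($i = 0; $i + 1 < count($cells); $i += 2) {
            $label = rtrim(trim($cells[$i]->plaintext), ':');
            $data[$label] = trim($cells[$i + 1]->plaintext);
        }
    }
}

print_r($data);
```

For the real page, replacing str_get_html($markup) with file_get_html() and the URL should work the same way, provided the label text appears in only one table. Reading the cells in pairs assumes every row follows the label/value layout shown in the question.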