When I scrape the table, the tr and td positions change from page to page. Below are the original tables.
<table class="scoretable">
<tbody>
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>
<tr><td class="jdhead">Location</td><td class="fullhead">Madrid</td></tr>
<tr><td class="jdhead">Country</td><td class="fullhead">Spain</td></tr>
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>
</tbody>
</table>
<table class="scoretable">
<tbody>
<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>
<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>
<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>
<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>
</tbody>
</table>
The two tables above are from different pages. I need to scrape Name, Phone and Role.
$url = "http://name.com/listings";
$html = file_get_html( $url );
$posts1 = $html->find('td[class=fullhead]',1);
foreach ( $posts1 as $post1 ) {
$poster1 = $post1->outertext;
echo $poster1;
}
I would use preg_match to extract the needed values from the HTML, like this:
<?php
$url = 'http://name.com/listings';
$html = file_get_contents($url);
// Each pattern is anchored on the label cell, so the row's position in the table does not matter.
if (preg_match('~<tr><td class="jdhead">Name</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) {
echo $matches[1]; // the name
}
if (preg_match('~<tr><td class="jdhead">Phone</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) {
echo $matches[1]; // the phone
}
if (preg_match('~<tr><td class="jdhead">Role</td><td class="fullhead">([^<]*)</td></tr>~', $html, $matches)) {
echo $matches[1]; // the role
}
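The three checks above differ only in the label, so they could be folded into one small helper. The following is a minimal sketch, not a tested drop-in: `extractField` is a hypothetical name, and the pattern assumes the `jdhead` / `fullhead` markup shown in the question, with some tolerance for extra whitespace and letter case.

```php
<?php
// Hypothetical helper: returns the value cell that follows the given label cell,
// or null when the label is not present on the page.
function extractField(string $html, string $label): ?string
{
    // preg_quote() escapes any regex metacharacters in the label.
    // \s* tolerates whitespace between tags; the i flag ignores case.
    $pattern = '~<td\s+class="jdhead"\s*>\s*' . preg_quote($label, '~')
             . '\s*</td>\s*<td\s+class="fullhead"\s*>([^<]*)</td>~i';

    return preg_match($pattern, $html, $m) ? trim($m[1]) : null;
}

// The shorter table from the question, used here as sample input.
$shortTable = '<table class="scoretable"><tbody>'
    . '<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>'
    . '<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>'
    . '<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>'
    . '<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>'
    . '</tbody></table>';

foreach (['Name', 'Phone', 'Role'] as $label) {
    $value = extractField($shortTable, $label);
    if ($value !== null) {
        echo $label . ': ' . $value . PHP_EOL;
    }
}
```

Because the lookup is keyed on the label text, it should give the same result whether or not the Location and Country rows are present.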
Update (see comments below):
<?php
$url = 'http://jobsearch.naukri.com/job-listings-010915006292';
$html = file_get_contents($url);
if (preg_match('~<TR VALIGN="top"> <TD CLASS="jdHead">Job Posted </TD> <TD VALIGN="top" CLASS="detailJob">([^<]*)</TD> </TR>~', $html, $matches)) {
echo 'Job Posted: ' . $matches[1] . '<br><br>';
}
if (preg_match('~<TR VALIGN="top"> <TD CLASS="jdHead">Job Description</TD> <TD VALIGN="top" CLASS="detailJob">(.*?)</TD> </TR>~', $html, $matches)) {
echo 'Job Description: ' . $matches[1] . '<br><br>';
}
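Since these regular expressions have to match the page markup character for character, a possible alternative is to parse the HTML and look rows up by their label cell. This is a different technique from the preg_match approach above, sketched with PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`). The function name `scrapeFields` is hypothetical, and the sketch assumes each row has the label in its first cell and the value in its second, as in the tables from the question.

```php
<?php
// Hypothetical helper: maps each requested label to the text of its value cell.
function scrapeFields(string $html, array $labels): array
{
    // Collect libxml warnings internally instead of printing them,
    // since real-world markup is often not strictly valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $result = [];
    foreach ($labels as $label) {
        // Select the second cell of the row whose first cell holds the label.
        // Assumption: $label contains no double quotes.
        $query = sprintf('//tr[td[1][normalize-space(.)="%s"]]/td[2]', $label);
        $nodes = $xpath->query($query);
        if ($nodes !== false && $nodes->length > 0) {
            $result[$label] = trim($nodes->item(0)->textContent);
        }
    }
    return $result;
}

// The longer table from the question, used here as sample input.
$fullTable = '<table class="scoretable"><tbody>'
    . '<tr><td class="jdhead">Name</td><td class="fullhead">John</td></tr>'
    . '<tr><td class="jdhead">Age</td><td class="fullhead">30</td></tr>'
    . '<tr><td class="jdhead">Phone</td><td class="fullhead">91234988788</td></tr>'
    . '<tr><td class="jdhead">Location</td><td class="fullhead">Madrid</td></tr>'
    . '<tr><td class="jdhead">Country</td><td class="fullhead">Spain</td></tr>'
    . '<tr><td class="jdhead">Role</td><td class="fullhead">Manager</td></tr>'
    . '</tbody></table>';

print_r(scrapeFields($fullTable, ['Name', 'Phone', 'Role']));
```

Labels missing from a page are simply left out of the returned array, so the caller can check with `isset()` before using a value.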