php curl dom screen-scraping

How to get the HTML of a specific div in PHP?


I'm new to scraping with PHP, so please pardon any silly mistakes. I'm trying to scrape job listings from Indeed. I fetch the HTML with cURL and then parse it with DOM. The problem is that I want to get a div by its class name, but the div sits inside a table row. I don't know what I'm doing wrong. Please help.

libxml_use_internal_errors(true);
$dom = new DOMDocument;
@$dom->loadHTML($result); 
$xpath = new DOMXpath($dom);
$obj = $dom->getElementById('resultsBody');
if (!empty($obj)) {
    $obj = $obj->getElementsByTagName('div[@class="jobsearch-SerpJobCard"]');
    $ob= $obj->item(0);
    print $ob->nodeValue;
    print_r($obj);
} else {
    echo "No data found";
}
exit;

Also, can we get a specific div along with its HTML and save it in a variable, so that we can then get the divs inside it?


Solution

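  • You can find the element with XPath. getElementsByTagName() accepts only a plain tag name such as 'div', not an expression like div[@class="..."], so the call in the question matches nothing. Use DOMXPath::query() instead: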

    $classname="your-class";
    $nodes = $xpath->query("//*[contains(@class, '$classname')]");
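
To cover the follow-up about keeping a div's HTML and then searching inside it, here is a minimal sketch. It assumes markup shaped roughly like the question describes; the sample HTML, the inner "title" class, and the variable names $cards, $cardHtml and $titles are made up for illustration and are not taken from the real Indeed page. It relies on DOMDocument::saveHTML($node) to serialize a single node (tag included) and on the second argument of DOMXPath::query() to restrict a search to one node's descendants.

```php
<?php
// Hypothetical sample markup standing in for the cURL response ($result in the question).
$result = <<<HTML
<html><body>
<table><tr><td id="resultsBody">
  <div class="jobsearch-SerpJobCard row"><div class="title">PHP Developer</div></div>
  <div class="jobsearch-SerpJobCard row"><div class="title">Data Engineer</div></div>
</td></tr></table>
</body></html>
HTML;

libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($result);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Match the class as a whole token, so a class such as
// "jobsearch-SerpJobCard-footer" would not be picked up by accident.
$class = 'jobsearch-SerpJobCard';
$cards = $xpath->query(
    "//*[@id='resultsBody']//div[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]"
);

$cardHtml = [];
$titles = [];
foreach ($cards as $card) {
    // saveHTML($node) returns the markup of that node, including its own tag.
    $cardHtml[] = $dom->saveHTML($card);

    // Passing $card as the context node (and starting the path with ".")
    // limits the search to elements inside this card.
    $title = $xpath->query(".//div[contains(@class, 'title')]", $card)->item(0);
    $titles[] = $title ? trim($title->textContent) : null;
}

print_r($titles);
```

Each entry of $cardHtml is an ordinary string, so it can be stored or logged. For further lookups it is usually simpler to keep the DOMElement itself ($card) and run relative XPath queries against it, as the loop does, rather than re-parsing the saved string.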