php dom domdocument file-get-contents domxpath

Extract HTML content using PHP

I have the following code:

$html = file_get_contents("http://www.jabong.com/giordano-Dtlm60058-Black-Analog-Watch-267058.html");

$dom = new DOMDocument();


$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//*[@id="price_div"]/div[2]/span[2]');  //this catches all elements with 
var_dump($nodes);

I want to extract the price from the page, but this XPath query does not return any result.

Solution

Did you ever solve the problem? Here is some working code:

$html = file_get_contents("http://www.jabong.com/giordano-Dtlm60058-Black-Analog-Watch-267058.html");

//suppress errors (there are a lot of them on the page in question)
libxml_use_internal_errors(true);

$dom = new DOMDocument();

//don't preserve whitespace (set on the DOMDocument itself, before loading)
$dom->preserveWhiteSpace = false;

//as @Larry.Z comments, you forgot to load the $html
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

//assuming there can be more than one "price set" on each page
$prices = array();

$price_divs = $xpath->query('//div[@id="price_div"]');
foreach ($price_divs as $price_div) {
    $price=array();
    foreach ($price_div->childNodes as $price_item) {
        $content=trim($price_item->textContent);
        if ($content!='') $price[]=$content;
    } 
    $prices[]=$price;
}

echo '<pre>';
print_r($prices);
echo '</pre>';

This outputs:

Array
(
    [0] => Array
        (
            [0] => Save 66%
            [1] => Rs. 5850
            [2] => Rs. 1999
        )

)
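
The working code silences libxml's parse warnings with libxml_use_internal_errors(true). If you want to see what was suppressed, the buffered errors can be read back afterwards. This is a minimal sketch, assuming a small made-up HTML string instead of the real page; which errors are reported (if any) depends on the libxml version installed:

```php
<?php
// Hypothetical, deliberately sloppy markup standing in for a real-world page.
$html = '<div id="price_div"><span>Rs. 1999</div><unknowntag>';

// Buffer libxml errors instead of emitting PHP warnings
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// libxml_get_errors() returns an array of LibXMLError objects
// (the array may be empty if the parser had no complaints)
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the buffer so it does not keep growing across documents
libxml_clear_errors();
```

Clearing the buffer matters mostly when many pages are parsed in one script run.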

You can skip the $prices[] part and use only $price if there will never be more than one price set per page.
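
If only the final price is needed, the XPath from the question should also work once the HTML has actually been loaded. A minimal sketch, assuming markup shaped like the sample string below (it is invented for illustration; the real page's structure may differ, in which case the path has to be adjusted):

```php
<?php
// Hypothetical sample markup; the real page may be structured differently.
$html = '<html><body><div id="price_div">'
      . '<div>Save 66%</div>'
      . '<div><span>Rs. 5850</span><span>Rs. 1999</span></div>'
      . '</div></body></html>';

libxml_use_internal_errors(true);   // keep parse warnings out of the output

$dom = new DOMDocument();
$dom->loadHTML($html);              // the step missing from the question's code

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//*[@id="price_div"]/div[2]/span[2]');

// item(0) returns null when the expression matched nothing
$node  = $nodes->item(0);
$price = $node !== null ? trim($node->textContent) : null;

echo $price, "\n";
```

Checking for null before reading textContent avoids an error when the page layout changes and the path no longer matches.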