I need a little help getting content from external web pages.
I need to get a div, and then delete another div from inside it. This is my code; can someone help me?
This is the relevant portion of the HTML I am parsing:
<html>
...
<body class="domain-4 page-product-detail" > ...
<div id="informacio" class="htab-fragment"> <!-- must select this -->
<h2 class="description-heading htab-name">Utazás leírása</h2>
<div class="htab-mobile tab-content">
<p class="tab-annot">* Hivatalos ismertető</p>
<div id="trip-detail-question"> <!-- must delete this -->
<form> ...</form>
</div>
<h3>USP</h3><p>Nagy, jól szervezett és családbarát ...</p>
<div class="message warning-message">
<p>Az árak már minden aktuális kedvezményt tartalmaznak!</p>
<span class="ico"></span>
</div>
</div>
</div>
...
</body>
</html>
I need to get the div with id="informacio", and after that I need to delete the div with id="trip-detail-question" from it, including the form it contains.
This is my code, but it's not working correctly:
function get_content($url){
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile($url);
$xpath = new DOMXPath($doc);
$query = "//div[@id='informacio']";
$entries = $xpath->query($query)->item(0);
foreach($xpath->query("div[@id='trip-detail-question']", $entries) as $node)
$node->parentNode->removeChild($node);
$var = $doc->saveXML($entries);
return $var;
}
Your second XPath expression is incorrect. Evaluated in the context of the div you selected previously, it only matches a div that is a direct child of that node. In effect, you are trying to select:
//div[@id='informacio']/div[@id='trip-detail-question']
and that node does not exist. You want this node:
//div[@id='informacio']/div/div[@id='trip-detail-question']
which you can also select like this (allowing any element, not just div):
//div[@id='informacio']/*/div[@id='trip-detail-question']
or like this (allowing more than one nesting level):
//div[@id='informacio']//div[@id='trip-detail-question']
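To see the difference between these expressions, here is a small sketch that counts the matches for each one. It runs against a trimmed-down copy of the markup from the question. The `$html` string and the `$counts` array are made up for this illustration.

```php
<?php
// Trimmed-down copy of the markup from the question (illustrative only).
$html = '<html><body><div id="informacio"><div class="tab-content">'
      . '<div id="trip-detail-question"><form></form></div>'
      . '</div></div></body></html>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$expressions = [
    "//div[@id='informacio']/div[@id='trip-detail-question']",     // direct child only
    "//div[@id='informacio']/div/div[@id='trip-detail-question']", // exactly one div in between
    "//div[@id='informacio']/*/div[@id='trip-detail-question']",   // exactly one element in between
    "//div[@id='informacio']//div[@id='trip-detail-question']",    // any depth
];

$counts = [];
foreach ($expressions as $expr) {
    $n = $xpath->query($expr)->length; // number of matching nodes
    $counts[] = $n;
    echo $n, "  ", $expr, "\n";
}
```

Only the first expression should report zero matches, because the target div is a grandchild of `informacio`, not a child.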
In the context of the first div, the correct XPath expression would be:
.//div[@id='trip-detail-question']
If you change it in your code, it should work:
foreach($xpath->query(".//div[@id='trip-detail-question']", $entries) as $node)
$node->parentNode->removeChild($node);
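Putting it together, a minimal sketch of the whole function might look like the following. It rests on a few assumptions that are not in your code: the function name `extract_info` is hypothetical, it takes an HTML string instead of a URL so it can be run without network access, it returns null when the outer div is missing, and it uses `saveHTML` rather than `saveXML` for the output.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the question).
function extract_info(string $html): ?string
{
    libxml_use_internal_errors(true); // real-world HTML often triggers parser warnings
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $container = $xpath->query("//div[@id='informacio']")->item(0);
    if ($container === null) {
        return null; // outer div not found
    }

    // Relative expression: the leading ".//" searches all descendants of $container.
    $matches = $xpath->query(".//div[@id='trip-detail-question']", $container);
    // Defensive copy into an array before removing nodes from the tree.
    foreach (iterator_to_array($matches) as $node) {
        $node->parentNode->removeChild($node);
    }

    return $doc->saveHTML($container);
}

// Usage with a trimmed-down copy of the markup from the question.
$html = '<html><body><div id="informacio"><div class="tab-content">'
      . '<div id="trip-detail-question"><form></form></div>'
      . '<h3>USP</h3></div></div></body></html>';

echo extract_info($html), "\n";
```

The null check matters when the page layout changes: without it, a missing `informacio` div would pass null as the context node to the second query.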