I have a DOMDocument in PHP and I'm trying to delete all nodes except for a container with a specific ID.
Let's say I have the following DOM document:
<section>
<div id="first-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
</div>
<div id="second-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
<div id="sub-section">
<h2>Hello World</h2>
</div>
</div>
<div id="third-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
</div>
</section>
My PHP Code:
$domDocument = $this->domParser->loadHTML($markup);
$xpath = new \DOMXPath($domDocument);
$nlist = $xpath->query("//*[@id='sub-section']");
$domDocument->saveHTML();
This query finds the correct container. But how can I remove every other node from the document, so that in the end only the following nodes remain:
<div id="sub-section">
<h2>Hello World</h2>
</div>
What I tried
I tried the reverse approach, with a query like this: "/*/*[not(@id='test')]"
But it does not work well for nested HTML structures. Sometimes, depending on the structure, it removes all nodes.
What's the way to go here?
That logic is strange. How do you then know what to keep? And what about the nested case?
I would pick the nodes I need and copy them to a new document.
$xml = <<<'_XML'
<section>
<div id="first-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
</div>
<div id="second-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
<div id="sub-section">
<h2>Hello World</h2>
</div>
</div>
<div id="third-section">
<ul>
<li>Test</li>
<li>Test</li>
</ul>
</div>
</section>
_XML;
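// Collect libxml parse warnings internally instead of emitting them
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($xml);
// Deep-copy the wanted node (including its children) into a fresh, empty document
$newDoc = new DOMDocument();
$newDoc->appendChild($newDoc->importNode($doc->getElementById('sub-section'), true));
echo $newDoc->saveHTML();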
When you need just one node, it is even easier to pass that node to saveHTML():
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($xml);
echo $doc->saveHTML($doc->getElementById('sub-section'));
Both examples produce the same output:
<div id="sub-section">
<h2>Hello World</h2>
</div>
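If the original document really has to be modified in place, as the question asks, one option is to detach the wanted node and make it the only child of the document. The reverse query from the question breaks on nested markup because "/*/*[not(@id='...')]" also matches the ancestors of the target (after loadHTML() that is usually the body element), and removing an ancestor removes the target along with it. The following is only a minimal sketch: it assumes the id occurs exactly once, and the shortened markup and variable names are made up for illustration.

```php
<?php
// Shortened version of the markup from the question (illustrative only)
$html = '<section>'
      . '<div id="first-section"><ul><li>Test</li></ul></div>'
      . '<div id="second-section"><ul><li>Test</li></ul>'
      . '<div id="sub-section"><h2>Hello World</h2></div></div>'
      . '<div id="third-section"><ul><li>Test</li></ul></div>'
      . '</section>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// Same query as in the question
$xpath  = new DOMXPath($doc);
$target = $xpath->query("//*[@id='sub-section']")->item(0);

if ($target !== null) {
    // Detach the wanted node first so it survives the cleanup below
    $target->parentNode->removeChild($target);

    // Drop everything that is left in the document (doctype, html, ...)
    while ($doc->firstChild !== null) {
        $doc->removeChild($doc->firstChild);
    }

    // Re-attach the wanted node as the new document element
    $doc->appendChild($target);
}

echo $doc->saveHTML();
```

Checking for null matters here: if the id is not found, the document is left untouched instead of being emptied. For read-only extraction, the saveHTML($node) variant above is still the shorter route.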