Given typical HTML such as
<ol>
<li>
<span>parent</span>
<ul>
<li><span>nested 1</span></li>
<li><span>nested 2</span></li>
</ul>
</li>
</ol>
I am trying to get the contents of the <li> elements, but I need to get the parent item and the items nested under <ul> separately.
If I do this:
$ols = $doc->getElementsByTagName('ol');
foreach($ols as $ol){
$lis = $ol->getElementsByTagName('li');
// here I need li immediately under <ol>
}
$lis contains all the li elements, both the parent and the nested ones.
How can I get only the li elements one level under ol, ignoring deeper levels?
There are two approaches to this. The first builds on how you are already using getElementsByTagName(): just pick out the first <li> tag and assume that it is the correct one...
$ols = $doc->getElementsByTagName('ol');
foreach ($ols as $ol) {
    // item(0) is the first <li> in document order below this <ol>
    $li = $ol->getElementsByTagName('li')->item(0);
    echo $doc->saveHTML($li).PHP_EOL;
}
This echoes...
<li>
<span>parent</span>
<ul>
<li><span>nested 1</span></li>
<li><span>nested 2</span></li>
</ul>
</li>
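This works for the sample markup, but it is not exact: it returns only the first <li> in document order, so any further <li> elements directly under the same <ol> are skipped.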
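If the first-match assumption is too loose, one way to stay with plain DOM calls is to walk the child nodes of each <ol> and keep only the element nodes named li. This is a minimal sketch rather than a definitive solution; the second top-level <li> in the sample string and the $direct array are assumptions added for illustration.

```php
<?php
// Markup from the question, plus a second top-level <li> (an added assumption)
// to show that every direct child is collected, not just the first one.
$html = '<ol><li><span>parent</span><ul><li><span>nested 1</span></li>'
      . '<li><span>nested 2</span></li></ul></li><li><span>parent 2</span></li></ol>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$direct = []; // hypothetical name: the <li> elements immediately under an <ol>
foreach ($doc->getElementsByTagName('ol') as $ol) {
    foreach ($ol->childNodes as $child) {
        // childNodes can also contain text nodes, so filter on element type and name
        if ($child instanceof DOMElement && $child->nodeName === 'li') {
            $direct[] = $child;
        }
    }
}

foreach ($direct as $li) {
    echo $doc->saveHTML($li), PHP_EOL;
}
```

Because only direct children are inspected, the <li> elements inside the nested <ul> never reach $direct.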
The other method is to use XPath, where you can specify the levels of the document tags you want to retrieve. This uses //ol/li, which selects every <li> tag that is a direct child of an <ol> tag.
$xp = new DOMXPath($doc);
$lis = $xp->query("//ol/li");
foreach ( $lis as $li ) {
echo $doc->saveHTML($li);
}
This also gives...
<li>
<span>parent</span>
<ul>
<li><span>nested 1</span></li>
<li><span>nested 2</span></li>
</ul>
</li>
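If you want to keep the per-<ol> loop from the question and handle the parent and nested items separately, DOMXPath::query() also accepts a context node as its second argument, so the expression can be written relative to each $ol. The following is a sketch under that assumption; the $parents and $nested arrays are hypothetical names used only for illustration.

```php
<?php
// Same structure as the markup in the question, written on one line
$html = '<ol><li><span>parent</span><ul><li><span>nested 1</span></li>'
      . '<li><span>nested 2</span></li></ul></li></ol>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

$parents = []; // text of the <span> in each <li> directly under an <ol>
$nested  = []; // text of each <li> one level further down, under a <ul>

foreach ($doc->getElementsByTagName('ol') as $ol) {
    // "./li" is evaluated relative to $ol, so deeper <li> elements are ignored
    foreach ($xp->query('./li', $ol) as $li) {
        $parents[] = trim($xp->evaluate('string(./span)', $li));
    }
    // "./li/ul/li" reaches only the items nested under a <ul> inside those <li>
    foreach ($xp->query('./li/ul/li', $ol) as $li) {
        $nested[] = trim($li->textContent);
    }
}

print_r($parents);
print_r($nested);
```

For the sample markup this should leave one entry in $parents and two in $nested, which keeps the two levels apart without a second pass over the document.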