I have to parse an HTML structure like this:
<div class='container'>
<div class='inner-div'>
<span class='text'>...</span>
<div class='author'>
<span data-author='Alpha'>...</span>
</div>
<div class='summary'>
<span data-summary='Exclusive'>Text 1</span>
</div>
</div>
<div class='inner-div'>
<span class='text'>...</span>
<div class='author'>
<span data-author='Beta'>...</span>
</div>
<div class='summary'>
<span data-summary='Non-Exclusive'>Text 2</span>
</div>
</div>
<div class='inner-div'>
<span class='text'>...</span>
<div class='author'>
<span data-author='Gamma'>...</span>
</div>
<div class='summary'>
<span data-summary='Exclusive'>Text 3</span>
</div>
</div>
<div class='inner-div'>
<span class='text'>...</span>
<div class='author'>
<span data-author='Delta'>...</span>
</div>
<div class='summary'>
<span data-summary='Non-Exclusive'>Text 4</span>
</div>
</div>
...
<div class='inner-div'>
<span class='text'>...</span>
<div class='author'>
<span data-author='Zeta'>...</span>
</div>
<div class='summary'>
<span data-summary='Exclusive'>Text 5</span>
</div>
</div>
</div>
I wish to obtain the first 'Exclusive' summary where author is not 'Alpha'. In the above example it would be 'Text 3'. How can I parse this using Simple HTML DOM or even XML DOM?
ADDENDUM: I am looking to parse the HTML using the PHP Simple HTML DOM library. I know how to do this in jQuery, but Simple HTML DOM doesn't seem to support an equivalent of (:has).
No, but here's a Simple HTML DOM replacement that does (you want :not instead of :has, btw):
include_once('advanced_html_dom.php');
// $str holds the HTML from the question
$html = str_get_html($str);
// the second argument (0) returns only the first match
echo $html->find('.author:not(> [data-author=Alpha]) ~ .summary > [data-summary=Exclusive]', 0);
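Since the question also mentions XML DOM, here is a rough sketch of the same lookup using PHP's built-in DOMDocument and DOMXPath, with no extra library. The sample markup is a trimmed copy of the HTML from the question, and the variable names ($str, $query, $result) are just placeholders for illustration. It assumes class attributes contain exactly one class name, as they do in the example.

```php
<?php
// Trimmed copy of the markup from the question (first three blocks only).
$str = <<<HTML
<div class='container'>
<div class='inner-div'>
<div class='author'><span data-author='Alpha'>...</span></div>
<div class='summary'><span data-summary='Exclusive'>Text 1</span></div>
</div>
<div class='inner-div'>
<div class='author'><span data-author='Beta'>...</span></div>
<div class='summary'><span data-summary='Non-Exclusive'>Text 2</span></div>
</div>
<div class='inner-div'>
<div class='author'><span data-author='Gamma'>...</span></div>
<div class='summary'><span data-summary='Exclusive'>Text 3</span></div>
</div>
</div>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($str);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Keep only .inner-div blocks whose author is not Alpha, then take the
// first 'Exclusive' summary span among them in document order.
$query = "(//div[@class='inner-div']"
       . "[not(div[@class='author']/span[@data-author='Alpha'])]"
       . "/div[@class='summary']/span[@data-summary='Exclusive'])[1]";

$nodes  = $xpath->query($query);
$result = $nodes->length > 0 ? $nodes->item(0)->textContent : null;

echo $result, PHP_EOL; // prints "Text 3"
```

The predicate on the inner-div plays the role of jQuery's :has combined with :not, and the trailing [1] on the parenthesised expression picks the first match overall rather than the first match within each block.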