I am new to DOMXPath and am trying to learn more. Currently I have an HTML structure like this:
<span class="1">
<div class="headerClass">
Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
</div>
<table class="tableClass" id="tableID">
<tr>
<td>some text</td>
<td>some text</td>
<td>some text</td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website1.com" target="_blank">My Link</a></td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website2.com" target="_blank">My Link</a></td>
</tr>
</table>
</span>
<span class="2">
<div class="headerClass">
Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span>
</div>
<table class="tableClass" id="tableID">
<tr>
<td>some text</td>
<td>some text</td>
<td>some text</td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website1.com" target="_blank">My Link</a></td>
</tr>
<tr>
<td>some text</td>
<td>some text</td>
<td><a href="http://www.website2.com" target="_blank">My Link</a></td>
</tr>
</table>
</span>
... and the spans continue: 3, 4, 5 ... etc
To retrieve this HTML code from the source file, I am using this:
$oDomXpath = new DOMXpath($oDom);
$query = "//span[number(@class)=number(@class)]";
$oDomObject = $oDomXpath->query($query);
foreach ($oDomObject as $oObject) {
// WHAT GOES HERE????
}
I need to store the following values in an array: the text of <div class="headerClass"> without the HTML tags, and the text of <span class="spanClass2">.
How can I accomplish this? What would I have to put inside the foreach loop? Do I necessarily need to run another query?
Thank you very much in advance for your help!
You have a choice: you can use several XPath queries and obtain the values one by one, or you can build a single XPath query that combines several paths with the union operator (|):
<pre><?php
$dom = new DOMDocument();
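// Collect libxml parse warnings internally instead of silencing them with @
libxml_use_internal_errors(true);
$dom->loadHTMLFile('yourfile.html');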
$xpath = new DOMXPath($dom);
$xquery = <<<'EOD'
//span[number(@class)=@class]/@class |
//span[number(@class)=@class]/div[@class='headerClass'] |
//span[number(@class)=@class]/div[@class='headerClass']/span[@class='spanClass2'] |
//span[number(@class)=@class]/table[@class='tableClass']/tr/td/a
EOD;
$nodes = $xpath->query($xquery);
foreach ($nodes as $node) {
    if ($node->nodeType === XML_ELEMENT_NODE) {
        // Element nodes: tell the div, the span and the links apart by tag name
        switch ($node->nodeName) {
            case 'div':
                echo '<br/>div content: ' . $node->nodeValue;
                break;
            case 'span':
                echo '<br/>span content: ' . $node->nodeValue;
                break;
            default:
                echo '<br/>url: ' . $node->getAttribute('href');
        }
    } else {
        // Attribute node: the numeric class of the outer span
        echo '<br/><br/>number: ' . $node->value;
    }
}
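The other option, several queries, could look roughly like the sketch below. It passes each outer span as the context node (the second argument of query() and evaluate()), so the inner paths are relative to that span, and it stores the values in an array instead of echoing them. The sample markup is shortened from the question, and the $result variable and its keys (number, header, span2, urls) are made-up names for illustration only.

```php
<?php
// Shortened sample markup based on the question (assumption: the real file has the same shape)
$html = <<<'EOD'
<html><body>
<span class="1">
<div class="headerClass">Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span></div>
<table class="tableClass">
<tr><td>some text</td><td>some text</td></tr>
<tr><td>some text</td><td><a href="http://www.website1.com" target="_blank">My Link</a></td></tr>
<tr><td>some text</td><td><a href="http://www.website2.com" target="_blank">My Link</a></td></tr>
</table>
</span>
<span class="2">
<div class="headerClass">Here you have <span class="spanClass1">some text</span>. And here there is <span class="spanClass2">even more text</span></div>
<table class="tableClass">
<tr><td>some text</td><td><a href="http://www.website1.com" target="_blank">My Link</a></td></tr>
</table>
</span>
</body></html>
EOD;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$result = [];                       // hypothetical name for the target array

// Outer query: only the spans whose class attribute is a number
foreach ($xpath->query('//span[number(@class)=@class]') as $span) {
    // Relative query: the href attributes of the links inside this span only
    $urls = [];
    foreach ($xpath->query("table[@class='tableClass']/tr/td/a/@href", $span) as $href) {
        $urls[] = $href->value;
    }

    $result[] = [
        'number' => $span->getAttribute('class'),
        // evaluate() returns a string here; normalize-space() strips the tags' surrounding whitespace
        'header' => $xpath->evaluate("normalize-space(div[@class='headerClass'])", $span),
        'span2'  => $xpath->evaluate("normalize-space(div[@class='headerClass']/span[@class='spanClass2'])", $span),
        'urls'   => $urls,
    ];
}

print_r($result);
```

Whether this or the single union query is preferable is mostly a matter of taste: the relative queries keep the values of each span grouped together without having to inspect node types afterwards.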