Suppose I have this HTML from a source (I am scraping it):
<tr class="calendar_row" data-eventid="41675">
<td class="alt2 eventDate smallfont" align="center"/>
<td class="alt2 smallfont" align="center">9:00pm</td>
<td class="alt2 smallfont" align="center">AUD</td>
<td class="alt2 icon smallfont" align="center">
<div class="cal_imp_medium" title="Medium Impact Expected"/>
</td>
<td class="alt2 eventHigh smallfont" align="center">
<div class="calendar_detail level_1" data-level="1" title="Open Detail"/>
</td>
<!-- I want to get this part below correctly -->
<td class="alt2 pad_left eventHigh smallfont" align="center">0.2%</td>
<td class="alt2 pad_left eventHigh smallfont" align="center"/>
<td class="alt2 pad_left eventHigh smallfont" align="center">
<span class="revised worse" title="Revised From -0.3%">-0.4%</span>
</td>
</tr>
I want to get the values (nodeValue) of those td elements through XPath:
$query = $xpath->query('//tr[@data-eventid="41675"]/td[@class="alt2 pad_left eventHigh smallfont"]');
I can't figure out why I'm only getting the value -0.4%. The HTML is somewhat complicated, but regardless of how it is formatted, is there a query that retrieves the values between the tags, including the empty one in the second td?
Full Code
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query_results = $xpath->query('//tr[@data-eventid="'.$data_eventid.'"]/td[@class="alt2 pad_left eventHigh smallfont"]');
foreach ($query_results as $values) {
    if ($values->nodeValue != ' ' and $values->nodeValue != '' and $values->nodeName != '#text') { // Discards empty values
        $table_values[$data_eventid][5] = $values->nodeValue;
    }
}
Try this:
//tr[@data-eventid="41675"]/td[@class="alt2 pad_left eventHigh smallfont"]/descendant-or-self::*/text()
You probably want the element nodes rather than their text nodes, though, so take the /text() off:
//tr[@data-eventid="41675"]/td[@class="alt2 pad_left eventHigh smallfont"]/descendant-or-self::*
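Judging from the full code in the question, a likely reason only -0.4% shows up is that the loop writes every match into the same slot, [5], so each iteration overwrites the previous one, and the if test skips the empty cell entirely. Here is a minimal sketch that keeps all three cells using the original query. As assumptions for illustration, the markup is trimmed to the relevant cells, the row is wrapped in a <table>, and the empty cell is written as <td></td>:

```php
<?php
// Sample markup trimmed from the question (assumption: wrapped in <table>,
// empty cell written as <td ...></td> instead of the self-closing form).
$html = <<<HTML
<table>
<tr class="calendar_row" data-eventid="41675">
<td class="alt2 smallfont" align="center">AUD</td>
<td class="alt2 pad_left eventHigh smallfont" align="center">0.2%</td>
<td class="alt2 pad_left eventHigh smallfont" align="center"></td>
<td class="alt2 pad_left eventHigh smallfont" align="center">
<span class="revised worse" title="Revised From -0.3%">-0.4%</span>
</td>
</tr>
</table>
HTML;

$data_eventid = '41675';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same query as in the question: all three target cells share this class string.
$cells = $xpath->query(
    '//tr[@data-eventid="' . $data_eventid . '"]'
    . '/td[@class="alt2 pad_left eventHigh smallfont"]'
);

$table_values = [];
foreach ($cells as $cell) {
    // Append instead of writing to a fixed index, and keep empty cells.
    // trim() removes the line breaks around the <span> in the last cell.
    $table_values[$data_eventid][] = trim($cell->nodeValue);
}

print_r($table_values);
```

With this markup the array should hold three entries: 0.2%, an empty string, and -0.4%. Because nodeValue on a td already includes the text of its descendants, the span's value is picked up without adding descendant-or-self to the query.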