I need to count how many of these items are open. There are four types of items: Easy, Medium, Difficult and Not-wanted, and each type appears as the text inside a div. The 'Not-wanted' items must be excluded from the count. Note that the 'Open' and 'Closed' values have varying amounts of whitespace around them. This is the HTML structure:
<table>
<tbody>
<tr>
<td>
<div>Difficult</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Closed </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Medium</div>
</td>
<td>Name</td>
<td>Open </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Medium</div>
</td>
<td>Name</td>
<td> Closed</td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td>Closed </td>
</tr>
<tr>
<td>
<div>Not-wanted</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Difficult</div>
</td>
<td>Name</td>
<td>Open</td>
</tr>
............
This is one of my attempts to solve the problem. It is obviously wrong, but I don't know how to get it right.
$doc = new DOMDocument();
$doc->loadHtmlFile('http://www.nameofsite.com');
$doc->preserveWhiteSpace = false;
$xpath = new DOMXPath($doc);
$elements = $xpath->query("/html/body/div[1]/div/section/div/section/article/div/div[1]/div/div/div[2]/div[1]/div[2]/div/section/div/div/table/tbody/tr");
$count = 0;
foreach ($elements as $element) {
if ($element->childNodes->nodeValue != 'Not-wanted') {
if ($element->childNodes->nodeValue === 'open') {
$count++;
}
}
}
echo $count;
I have only a rudimentary knowledge of DOMXPath and can only write simple queries, so this is too complex for me.
Can anybody help?
Thanks in advance.
Based on the data in your example, you can adjust the XPath expression to the following to select all the <tr> elements that match your conditions:
//table/tbody/tr[normalize-space(td[3]/text()) = 'Open' and td[1]/div/text() != 'Not-wanted']
$elements is then a DOMNodeList, and you can read its length property to get the number of nodes in the list.
For example:
$source = <<<SOURCE
<table>
<tbody>
<tr>
<td>
<div>Difficult</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Closed </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Medium</div>
</td>
<td>Name</td>
<td>Open </td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Medium</div>
</td>
<td>Name</td>
<td> Closed</td>
</tr>
<tr>
<td>
<div>Easy</div>
</td>
<td>Name</td>
<td>Closed </td>
</tr>
<tr>
<td>
<div>Not-wanted</div>
</td>
<td>Name</td>
<td> Open </td>
</tr>
<tr>
<td>
<div>Difficult</div>
</td>
<td>Name</td>
<td>Open</td>
</tr>
</tbody>
</table>
SOURCE;
$doc = new DOMDocument();
// Set before loading; changing this after the document is parsed has no effect.
$doc->preserveWhiteSpace = false;
$doc->loadHTML($source);
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//table/tbody/tr[normalize-space(td[3]/text()) = 'Open' and td[1]/div/text() != 'Not-wanted']");
echo $elements->length;
Which will result in:
5
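If you only need the number and not the rows themselves, one possible variation is to let XPath do the counting with count() and call DOMXPath::evaluate() instead of query(). The sketch below is only an illustration: the shortened table and the $openCount variable are made up for this example, and it assumes the status is always in the third cell. It also wraps the type cell in normalize-space(), in case the div text carries stray whitespace as well.

```php
<?php
// Hypothetical, shortened sample based on the table in the question.
$source = <<<SOURCE
<table>
<tbody>
<tr><td><div> Difficult </div></td><td>Name</td><td> Open </td></tr>
<tr><td><div>Easy</div></td><td>Name</td><td> Closed </td></tr>
<tr><td><div>Not-wanted</div></td><td>Name</td><td> Open </td></tr>
<tr><td><div>Medium</div></td><td>Name</td><td>Open</td></tr>
</tbody>
</table>
SOURCE;

$doc = new DOMDocument();
$doc->loadHTML($source);
$xpath = new DOMXPath($doc);

// count() yields a number rather than a node list, so evaluate() is used
// instead of query(). normalize-space() trims both cells before comparing.
$openCount = (int) $xpath->evaluate(
    "count(//table/tbody/tr[normalize-space(td[3]) = 'Open' and normalize-space(td[1]/div) != 'Not-wanted'])"
);

echo $openCount;
```

With this sample, only the first and last rows are open and not 'Not-wanted', so the script should print 2.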