I am parsing a large XML document and am having a lot of trouble when it comes to parsing child nodes. Below is a sample of what I'm trying to parse.
<link rel="http://xxxxx/people.employees" title="employees">
<people>
<link href="/154" rel="http://catalog/person" title="Guy Nom" />
<link href="/385" rel="http://catalog/person" title="Carrie Jin" />
<link href="/162" rel="http://catalog/person" title="Joe Zee" />
<link href="/2125" rel="http://catalog/person" title="Mark Polin" />
<link href="/9293" rel="http://catalog/person" title="Stephen Castor" />
<link href="/21822" rel="http://catalog/person" title="Callum Tinge" />
<link href="/2022" rel="http://catalog/person" title="Brian Lennon" />
<link href="/2040" rel="http://catalog/person" title="Jorja Fox" />
<link href="/2046" rel="http://catalog/person" title="Harry Harris" />
<link href="/2399" rel="http://catalog/person" title="Sam Muellerleile" />
</people>
</link>
<link rel="http://xxxxx/people/others" title="others">
<people>
<link href="/7143" rel="http://catalog/person" title="James Smith" />
</people>
</link>
I need to differentiate between 'employees' and 'others' and store them in separate fields. I want to do something like below:
if ($xmlReader->localName == 'link') {
    if ($xmlReader->getAttribute('title') == "employees") {
        //GO TO NEXT LINK TAG AND GET NAME
        $myObject->employees[$myObject->employees_count]['name'] = $xmlReader->getAttribute('title');
        $myObject->employees_count++;
    } else if ($xmlReader->getAttribute('title') == "others") {
        //GO TO NEXT LINK TAG AND GET NAME
        $myObject->others[$myObject->others_count]['name'] = $xmlReader->getAttribute('title');
        $myObject->others_count++;
    }
}
Obviously the bits that are commented above are the problem for me. I don't know how to read these child elements and, in my opinion, the PHP docs on this aren't great at all. I'd appreciate any help.
For XMLReader you can make use of the $depth property. The outer <link> element will likely have a depth of 1 (one), so while you go on reading, you can check whether the current node is still a descendant of it: once you see an END_ELEMENT with the same $depth, you know all of its children have been consumed.
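As a rough sketch of that depth check with a plain XMLReader (no subclass): the <catalog> root element is an assumption added only to make the shortened sample well-formed, and $groups and $parentDepth are illustrative names, not something from the question.

```php
<?php
// Shortened sample; the <catalog> wrapper is an assumption (the real root is not shown).
$xml = <<<'XML'
<catalog>
  <link rel="http://xxxxx/people.employees" title="employees">
    <people>
      <link href="/154" rel="http://catalog/person" title="Guy Nom" />
      <link href="/385" rel="http://catalog/person" title="Carrie Jin" />
    </people>
  </link>
  <link rel="http://xxxxx/people/others" title="others">
    <people>
      <link href="/7143" rel="http://catalog/person" title="James Smith" />
    </people>
  </link>
</catalog>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$groups = ['employees' => [], 'others' => []];

while ($reader->read()) {
    // Only look at opening <link> elements.
    if ($reader->nodeType !== XMLReader::ELEMENT || $reader->localName !== 'link') {
        continue;
    }
    $group = (string) $reader->getAttribute('title');
    // Skip links that are not one of the two groups, or that have no children.
    if (!isset($groups[$group]) || $reader->isEmptyElement) {
        continue;
    }
    $parentDepth = $reader->depth; // depth of the group <link>

    // Keep reading until the END_ELEMENT at the parent's depth shows up.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::END_ELEMENT && $reader->depth === $parentDepth) {
            break; // all children of this group are consumed
        }
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'link') {
            $groups[$group][] = $reader->getAttribute('title');
        }
    }
}
$reader->close();
```

After the loop, $groups['employees'] and $groups['others'] hold the person names for each group. The self-closing person links never produce an END_ELEMENT of their own, so the only END_ELEMENT at the parent's depth is the parent's closing tag.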
In an answer yesterday I showed how to encapsulate that logic by extending XMLReader:
It lets you pass the depth of the parent element to a new method called readToNextChildElement($depth), which traverses only the elements below that parent.
Usage Example:
$depth = $reader->depth; # parent element's depth
while ($reader->readToNextChildElement($depth)) {
    # only children
}
The implementation is:
class MyXMLReader extends XMLReader
{
    ...
    public function readToNextChildElement($depth)
    {
        // if the current element is the parent and
        // empty there are no children to go into
        if ($this->depth == $depth && $this->isEmptyElement) {
            return false;
        }
        while ($result = $this->read()) {
            if ($this->depth <= $depth) return false;
            if ($this->nodeType === self::ELEMENT) break;
        }
        return $result;
    }
    ...
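Applied to the XML from the question, this might look roughly like the following sketch. ChildXMLReader is a hypothetical stand-in that contains only the method shown above (nothing else from the linked answer), the <catalog> root is an assumption to make the shortened sample well-formed, and the plain $employees / $others arrays replace the question's $myObject fields.

```php
<?php
// Hypothetical minimal subclass: just the one method from the answer.
class ChildXMLReader extends XMLReader
{
    public function readToNextChildElement($depth)
    {
        // An empty parent element has no children to go into.
        if ($this->depth == $depth && $this->isEmptyElement) {
            return false;
        }
        while ($result = $this->read()) {
            if ($this->depth <= $depth) return false;      // left the parent
            if ($this->nodeType === self::ELEMENT) break;  // found an element
        }
        return $result;
    }
}

// Shortened sample; the <catalog> wrapper is an assumption.
$xml = <<<'XML'
<catalog>
  <link rel="http://xxxxx/people.employees" title="employees">
    <people>
      <link href="/154" rel="http://catalog/person" title="Guy Nom" />
      <link href="/385" rel="http://catalog/person" title="Carrie Jin" />
    </people>
  </link>
  <link rel="http://xxxxx/people/others" title="others">
    <people>
      <link href="/7143" rel="http://catalog/person" title="James Smith" />
    </people>
  </link>
</catalog>
XML;

$reader = new ChildXMLReader();
$reader->XML($xml);

$employees = [];
$others = [];

while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT || $reader->localName !== 'link') {
        continue;
    }
    $title = $reader->getAttribute('title');
    if ($title !== 'employees' && $title !== 'others') {
        continue;
    }
    $depth = $reader->depth; // depth of the group <link>
    while ($reader->readToNextChildElement($depth)) {
        // The method also stops on <people>, so filter for <link>.
        if ($reader->localName !== 'link') {
            continue;
        }
        $person = ['name' => $reader->getAttribute('title')];
        if ($title === 'employees') {
            $employees[] = $person;
        } else {
            $others[] = $person;
        }
    }
}
$reader->close();
```

Note that the method visits every element below the given depth, not just direct children, which is why the inner loop skips the <people> wrapper by name.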
You can find the rest of the code in the linked answer. Depending on your needs this might be helpful if you want to stay XMLReader based. Otherwise, if you can load the whole document into memory instead, XPath is much easier to use for querying your elements:
$employees_names = array_map(
    'strval',
    $sxml->xpath('//link[@title="employees"]//link/@title')
);
That was SimpleXML, with $sxml being the loaded document (a SimpleXMLElement).
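For completeness, a small self-contained sketch of the SimpleXML route covering both groups. The <catalog> root is an assumption to make the shortened sample well-formed, and $namesFor is just an illustrative helper name.

```php
<?php
// Shortened sample; the <catalog> wrapper is an assumption.
$xml = <<<'XML'
<catalog>
  <link rel="http://xxxxx/people.employees" title="employees">
    <people>
      <link href="/154" rel="http://catalog/person" title="Guy Nom" />
      <link href="/385" rel="http://catalog/person" title="Carrie Jin" />
    </people>
  </link>
  <link rel="http://xxxxx/people/others" title="others">
    <people>
      <link href="/7143" rel="http://catalog/person" title="James Smith" />
    </people>
  </link>
</catalog>
XML;

// Loads the whole document into memory.
$sxml = simplexml_load_string($xml);

// Illustrative helper: person names below the group link with the given title.
$namesFor = function (string $group) use ($sxml): array {
    $attributes = $sxml->xpath('//link[@title="' . $group . '"]//link/@title');
    return array_map('strval', $attributes);
};

$employees_names = $namesFor('employees');
$others_names = $namesFor('others');
```

This trades memory for convenience: the whole document is parsed up front, so for a very large file the XMLReader approach may remain the better fit.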