I ran the following code against the sample XML at the bottom of the question and I am getting unexpected results.
$xml = simplexml_load_string($xml_string);
$addresses = $xml->response->addressinformation;
var_dump($addresses->xpath('//record'));
I would expect this to return only the two record nodes that are descendants of the current $addresses node. Instead, it returns all five record nodes of the original $xml element. Everything I have read says that the // notation is relative to the current node.
I realize that there are other ways to get just the two records I've referenced in the question; $addresses->xpath('records/record'); is just one example. But this strange behavior is part of a larger problem I'm having, and I need to understand why it behaves this way. Can anyone help me understand?
Sample XML
$xml_string = '<?xml version="1.0" encoding="utf-8"?>
<root>
<response>
<addressinformation>
<records>
<record id="1">
<fullname>JOHN E DOE</fullname>
<firstname>JOHN</firstname>
<middlename>E</middlename>
<lastname>DOE</lastname>
<fulldob>01/01/1970</fulldob>
</record>
<record id="2">
<fullname>JOHN E DOE</fullname>
<firstname>JOHN</firstname>
</record>
</records>
</addressinformation>
<otherinformation>
<records>
<record id="3">
<fullname>JOHN DOE</fullname>
<firstname>JOHN</firstname>
<lastname>DOE</lastname>
<fulldob>01/01/1970</fulldob>
</record>
<record id="4">
<fullname>JOHN EDWARD DOE</fullname>
<firstname>JOHN</firstname>
<middlename>EDWARD</middlename>
<lastname>DOE</lastname>
<fulldob>19700000</fulldob>
</record>
<record id="5">
<fullname>JOHN EDWARD DOE</fullname>
<firstname>JOHN</firstname>
<middlename>EDWARD</middlename>
<lastname>DOE</lastname>
<fulldob>19830000</fulldob>
</record>
</records>
</otherinformation>
</response>
</root>
';
According to https://www.w3.org/TR/1999/REC-xpath-19991116/:
//para
selects all the para descendants of the document root and thus selects all para elements in the same document as the context node
and
.//para
selects the para element descendants of the context node
Note the leading dot in the second expression. The same applies in your case:
var_dump($addresses->xpath('.//record'));
returns only the two nodes you are expecting.
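Here is a minimal, self-contained sketch of the difference. It assumes a shortened version of the sample XML (three empty record elements instead of five full ones), and the variable names $all and $own are only illustrative:

```php
<?php
// Shortened, illustrative stand-in for the sample XML in the question.
$xml_string = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<root>
  <response>
    <addressinformation>
      <records><record id="1"/><record id="2"/></records>
    </addressinformation>
    <otherinformation>
      <records><record id="3"/></records>
    </otherinformation>
  </response>
</root>
XML;

$xml = simplexml_load_string($xml_string);
$addresses = $xml->response->addressinformation;

// Absolute path: evaluated from the document root, whatever the context node is.
$all = $addresses->xpath('//record');

// Relative path: evaluated from the context node ($addresses).
$own = $addresses->xpath('.//record');

echo count($all), "\n"; // 3
echo count($own), "\n"; // 2
```

With the full sample XML from the question, the same two calls should give 5 and 2 respectively.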
The reason, as far as I can tell, is that although every object here is a SimpleXMLElement, $addresses is not a standalone document. It still refers to a node inside the document that simplexml_load_string() parsed, and an expression beginning with / or // is absolute, so it is evaluated from that document's root no matter which element you call xpath() on. Once you think of the objects as nodes and subnodes of a single document, this makes sense.
However, I agree that the PHP docs do not spell this behaviour out, so I recommend you suggest an edit there.
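One way to check that a child element still belongs to the whole document is to ask it for the document element with an absolute path. This is only a sketch on a tiny made-up document; the names $top and $self are illustrative:

```php
<?php
// Tiny illustrative document, not the sample XML from the question.
$xml = simplexml_load_string(
    '<root><response><addressinformation/></response></root>'
);
$addresses = $xml->response->addressinformation;

// "/*" is an absolute path: it selects the document element,
// even though xpath() is called on a nested element.
$top = $addresses->xpath('/*');
echo $top[0]->getName(), "\n"; // root

// "." is relative: it selects the context node itself.
$self = $addresses->xpath('.');
echo $self[0]->getName(), "\n"; // addressinformation
```

So the leading slash, not the object you call xpath() on, decides where evaluation starts.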