php xml xpath simplexml

PHP - SimpleXMLElement not parsing correctly with namespaces


This is returned by the API:

<?xml version='1.0' encoding='utf-8'?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xml:base="https://exmple.com/odata/">
    <id>https://example.com/odata/PicklistOption(989L)</id>
    <title type="text" />
    <updated>2015-09-03T11:56:51Z</updated>
    <author>
        <name />
    </author>
    <link rel="edit" title="PicklistOption" href="PicklistOption(989L)" />
    <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/childPicklistOptions" type="application/atom+xml;type=feed" title="childPicklistOptions" href="PicklistOption(989L)/childPicklistOptions" />
    <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/parentPicklistOption" type="application/atom+xml;type=entry" title="parentPicklistOption" href="PicklistOption(989L)/parentPicklistOption" />
    <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/picklistLabels" type="application/atom+xml;type=feed" title="picklistLabels" href="PicklistOption(989L)/picklistLabels" />
    <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/picklist" type="application/atom+xml;type=entry" title="picklist" href="PicklistOption(989L)/picklist" />
    <category term="SFOData.PicklistOption" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
    <content type="application/xml">
        <m:properties>
            <d:id m:type="Edm.Int64">989</d:id>
            <d:status>ACTIVE</d:status>
            <d:sortOrder m:type="Edm.Int32">229</d:sortOrder>
            <d:minValue m:type="Edm.Double">-1</d:minValue>
            <d:externalCode>PL</d:externalCode>
            <d:optionValue m:type="Edm.Double">-1</d:optionValue>
            <d:maxValue m:type="Edm.Double">-1</d:maxValue>
        </m:properties>
    </content>
</entry>

Now I am trying to get the value of <d:id>:

$xml = new SimpleXMLElement($xmlstr);
$namespaces = $xml->getNameSpaces(true);
$xml->registerXPathNamespace('m', $namespaces['m']);
$xml->registerXPathNamespace('d', $namespaces['d']);

$id = $xml->xpath('/entry/content/m:properties/d:id');
var_dump($id);

But I get array(0).


Solution

  • Do not fetch the namespaces from the document. Define them in your application. The namespaces are the values of the xmlns and xmlns:* attributes; the prefixes are only aliases chosen by whoever wrote the document. A plain xmlns attribute declares the default namespace, so the tag entry is actually {http://www.w3.org/2005/Atom}entry.

    Namespaces have to be unique. To avoid conflicts, most people use URLs. (It is unlikely that other people will use your domain to define their namespaces.) The downside is that namespaces are long strings with special characters. This is solved by using namespace prefixes as aliases.

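    XPath 1.0 does not have a default namespace: an unprefixed name in an expression only matches elements that are in no namespace, which is why /entry/content/... finds nothing. You need to register a prefix for each namespace you want to use. The XPath engine resolves the prefix to the actual namespace and compares it with the resolved namespace of the node.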

    $xml = new SimpleXMLElement($xmlstr);
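    // Prefixes are aliases local to this code; only the namespace URIs must match the document.
    // xml:base is a base URI, not a namespace declaration, so it is not registered.
    $namespaces = [
      'a' => 'http://www.w3.org/2005/Atom',
      'm' => 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata',
      'd' => 'http://schemas.microsoft.com/ado/2007/08/dataservices'
    ];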
    foreach ($namespaces as $prefix => $namespace) {
      $xml->registerXPathNamespace($prefix, $namespace);
    }
    
    $id = $xml->xpath('/a:entry/a:content/m:properties/d:id');
    var_dump($id);
    

    Output:

    array(1) {
      [0]=>
      object(SimpleXMLElement)#2 (0) {
      }
    }
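
    The object in this dump looks empty, but that is only how var_dump() renders the element here; the text content is still available through a string cast. A minimal sketch, assuming a shortened stand-in for the API response ($xmlstr below is trimmed for illustration, not the full payload):

    ```php
    <?php
    // Shortened stand-in for the API response (an assumption for this sketch).
    $xmlstr = '<entry xmlns="http://www.w3.org/2005/Atom"'
        . ' xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"'
        . ' xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">'
        . '<content type="application/xml"><m:properties>'
        . '<d:id m:type="Edm.Int64">989</d:id>'
        . '</m:properties></content></entry>';

    $metadataNs = 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata';

    $xml = new SimpleXMLElement($xmlstr);
    $xml->registerXPathNamespace('a', 'http://www.w3.org/2005/Atom');
    $xml->registerXPathNamespace('m', $metadataNs);
    $xml->registerXPathNamespace('d', 'http://schemas.microsoft.com/ado/2007/08/dataservices');

    // xpath() returns an array of SimpleXMLElement objects (empty if nothing matches).
    $nodes = $xml->xpath('/a:entry/a:content/m:properties/d:id');

    $id = null;
    $type = null;
    if (!empty($nodes)) {
        // Casting the element to string yields its text content.
        $id = (string) $nodes[0];
        // Attributes in a namespace are read by passing the namespace URI.
        $type = (string) $nodes[0]->attributes($metadataNs)->type;
    }

    var_dump($id, $type);
    ```

    Checking that the result array is not empty before indexing it avoids notices when the expression matches nothing.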
    

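    The registration is per object: you will have to register the XPath namespaces again on each SimpleXMLElement you call xpath() on, including the elements returned by an earlier xpath() call.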
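
    One way to avoid forgetting the re-registration is to wrap it in a small helper. This is only a sketch: xpathWithNamespaces() is a hypothetical function name, and $xmlstr is a shortened stand-in for the API response.

    ```php
    <?php
    // Hypothetical helper: registers the prefixes on the given node, then runs the query.
    function xpathWithNamespaces(SimpleXMLElement $node, string $expression, array $namespaces): array
    {
        foreach ($namespaces as $prefix => $uri) {
            $node->registerXPathNamespace($prefix, $uri);
        }
        $result = $node->xpath($expression);

        // xpath() can return a non-array value on failure; normalize to an array.
        return is_array($result) ? $result : [];
    }

    // Shortened stand-in for the API response (an assumption for this sketch).
    $xmlstr = '<entry xmlns="http://www.w3.org/2005/Atom"'
        . ' xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"'
        . ' xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">'
        . '<content type="application/xml"><m:properties>'
        . '<d:id m:type="Edm.Int64">989</d:id>'
        . '<d:status>ACTIVE</d:status>'
        . '</m:properties></content></entry>';

    $namespaces = [
        'a' => 'http://www.w3.org/2005/Atom',
        'm' => 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata',
        'd' => 'http://schemas.microsoft.com/ado/2007/08/dataservices',
    ];

    $xml = new SimpleXMLElement($xmlstr);

    // First query on the document element.
    $properties = xpathWithNamespaces($xml, '/a:entry/a:content/m:properties', $namespaces)[0];

    // $properties is a separate SimpleXMLElement object, so the helper registers the prefixes again.
    $id = (string) xpathWithNamespaces($properties, 'd:id', $namespaces)[0];
    $status = (string) xpathWithNamespaces($properties, 'd:status', $namespaces)[0];

    var_dump($id, $status);
    ```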

    This is more convenient in DOM, where the namespaces are registered once on the DOMXPath object. DOMXPath::evaluate() executes XPath expressions and can return node lists or scalars, depending on the expression.

    // The DOMDocument constructor takes an XML version and encoding, not the XML source.
    $document = new DOMDocument();
    $document->loadXML($xmlstr);
    $xpath = new DOMXpath($document);
    // xml:base is a base URI, not a namespace declaration, so it is not registered.
    $namespaces = [
      'a' => 'http://www.w3.org/2005/Atom',
      'm' => 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata',
      'd' => 'http://schemas.microsoft.com/ado/2007/08/dataservices'
    ];
    foreach ($namespaces as $prefix => $namespace) {
      $xpath->registerNamespace($prefix, $namespace);
    }
    
    $id = $xpath->evaluate('string(/a:entry/a:content/m:properties/d:id)');
    var_dump($id);
    

    Output:

    string(3) "989"
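
    To illustrate how the return type of DOMXPath::evaluate() follows the expression, here is a minimal sketch. It assumes a shortened stand-in for the API response ($xmlstr is trimmed for illustration), and $base is just a local variable introduced to keep the expressions short.

    ```php
    <?php
    // Shortened stand-in for the API response (an assumption for this sketch).
    $xmlstr = '<entry xmlns="http://www.w3.org/2005/Atom"'
        . ' xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"'
        . ' xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">'
        . '<content type="application/xml"><m:properties>'
        . '<d:id m:type="Edm.Int64">989</d:id>'
        . '<d:status>ACTIVE</d:status>'
        . '<d:sortOrder m:type="Edm.Int32">229</d:sortOrder>'
        . '</m:properties></content></entry>';

    $document = new DOMDocument();
    $document->loadXML($xmlstr);

    $xpath = new DOMXPath($document);
    $xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');
    $xpath->registerNamespace('m', 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata');
    $xpath->registerNamespace('d', 'http://schemas.microsoft.com/ado/2007/08/dataservices');

    $base = '/a:entry/a:content/m:properties';

    // number() and count() return a float, boolean() returns a bool.
    $id = $xpath->evaluate("number($base/d:id)");
    $count = $xpath->evaluate("count($base/*)");
    $hasStatus = $xpath->evaluate("boolean($base/d:status)");

    // A location path returns a DOMNodeList that can be iterated.
    $names = [];
    foreach ($xpath->evaluate("$base/*") as $node) {
        $names[] = $node->localName;
    }

    var_dump($id, $count, $hasStatus, $names);
    ```

    Doing the type conversion inside the expression keeps the PHP side free of casts and of checks for empty node lists.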