Loading non-standard xml wsman data into object with php


This question has been answered in many variations, but none of them refer to my situation.

I'm pulling data using WSMan, which returns its output as a kind of pseudo-XML. I wouldn't even consider it "real" XML, since it has so many non-standard attributes. I need to be able to reference the output as an object within PHP, so at the moment I'm using a lot of str_replace. The problem is that the non-standard format varies (in some cases it returns something like <KeyID xsi:nil="true"/>, in others something like <CMCIP xsi:nil="true"/>), so it is difficult to foresee all of the different attributes I will have to strip out of the string before importing it as an object using simplexml_load_string.

So, my question in all simplicity: is there a way to load non-standard XML into an object? Here is a sample of the XML data, so that you know what madness we're dealing with.

<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:wsen="http://schemas.xmlsoap.org/ws/2004/09/enumeration">
  <s:Header>
    <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
    <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/enumeration/EnumerateResponse</wsa:Action>
    <wsa:RelatesTo>uuid:3ae2d181-04f0-14f0-8002-89040b5d1500</wsa:RelatesTo>
    <wsa:MessageID>uuid:43a291ab-04f0-14f0-8073-b516f1d9bed4</wsa:MessageID>
  </s:Header>
  <s:Body>
    <wsen:EnumerateResponse>
      <wsen:EnumerationContext>439c90e9-04f0-14f0-8072-b516f1d9bed4</wsen:EnumerationContext>
    </wsen:EnumerateResponse>
  </s:Body>
</s:Envelope>
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:wsen="http://schemas.xmlsoap.org/ws/2004/09/enumeration" xmlns:n1="http://schemas.dell.com/wbem/wscim/1/cim-schema/2/DCIM_SystemView" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <s:Header>
    <wsa:To>http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</wsa:To>
    <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/enumeration/PullResponse</wsa:Action>
    <wsa:RelatesTo>uuid:3af0a1eb-04f0-14f0-8003-89040b5d1500</wsa:RelatesTo>
    <wsa:MessageID>uuid:43a41fe8-04f0-14f0-8074-b516f1d9bed4</wsa:MessageID>
  </s:Header>
  <s:Body>
    <wsen:PullResponse>
      <wsen:Items>
        <n1:DCIM_SystemView>
          <n1:AssetTag/>
          <n1:BIOSReleaseDate>11/20/2013</n1:BIOSReleaseDate>
          <n1:BIOSVersionString>2.1.3</n1:BIOSVersionString>
          <n1:BaseBoardChassisSlot>NA</n1:BaseBoardChassisSlot>
          <n1:BatteryRollupStatus>1</n1:BatteryRollupStatus>
          <n1:BladeGeometry>255</n1:BladeGeometry>
          <n1:BoardPartNumber>061P35A00</n1:BoardPartNumber>
          <n1:BoardSerialNumber>CN70163231007K</n1:BoardSerialNumber>
          <n1:CMCIP xsi:nil="true"/>
          <n1:CPLDVersion>1.0.3</n1:CPLDVersion>
          <n1:CPURollupStatus>1</n1:CPURollupStatus>
          <n1:ChassisModel/>
          <n1:ChassisName>Main System Chassis</n1:ChassisName>
          <n1:ChassisServiceTag>REMOVED</n1:ChassisServiceTag>
          <n1:ChassisSystemHeight>2</n1:ChassisSystemHeight>
          <n1:DeviceDescription>System</n1:DeviceDescription>
          <n1:ExpressServiceCode>33088672189</n1:ExpressServiceCode>
          <n1:FQDD>System.Embedded.1</n1:FQDD>
          <n1:FanRollupStatus>1</n1:FanRollupStatus>
          <n1:HostName/>
          <n1:InstanceID>System.Embedded.1</n1:InstanceID>
          <n1:LastSystemInventoryTime>20140928010936.000000+000</n1:LastSystemInventoryTime>
          <n1:LastUpdateTime>20140220171215.000000+000</n1:LastUpdateTime>
          <n1:LicensingRollupStatus>1</n1:LicensingRollupStatus>
          <n1:LifecycleControllerVersion>2.1.0</n1:LifecycleControllerVersion>
          <n1:Manufacturer>Dell Inc.</n1:Manufacturer>
          <n1:MaxCPUSockets>2</n1:MaxCPUSockets>
          <n1:MaxDIMMSlots>24</n1:MaxDIMMSlots>
          <n1:MaxPCIeSlots>6</n1:MaxPCIeSlots>
          <n1:MemoryOperationMode>OptimizerMode</n1:MemoryOperationMode>
          <n1:Model>PowerEdge R720xd</n1:Model>
          <n1:NodeID>F7852V1</n1:NodeID>
          <n1:PSRollupStatus>1</n1:PSRollupStatus>
          <n1:PlatformGUID>3156324f-c0c6-3580-3810-00374c4c4544</n1:PlatformGUID>
          <n1:PopulatedCPUSockets>2</n1:PopulatedCPUSockets>
          <n1:PopulatedDIMMSlots>8</n1:PopulatedDIMMSlots>
          <n1:PopulatedPCIeSlots>2</n1:PopulatedPCIeSlots>
          <n1:PowerCap>598</n1:PowerCap>
          <n1:PowerCapEnabledState>3</n1:PowerCapEnabledState>
          <n1:PowerState>2</n1:PowerState>
          <n1:PrimaryStatus>1</n1:PrimaryStatus>
          <n1:RollupStatus>1</n1:RollupStatus>
          <n1:ServerAllocation xsi:nil="true"/>
          <n1:ServiceTag>REMOVED</n1:ServiceTag>
          <n1:StorageRollupStatus>1</n1:StorageRollupStatus>
          <n1:SysMemErrorMethodology>6</n1:SysMemErrorMethodology>
          <n1:SysMemFailOverState>NotInUse</n1:SysMemFailOverState>
          <n1:SysMemLocation>3</n1:SysMemLocation>
          <n1:SysMemMaxCapacitySize>1572864</n1:SysMemMaxCapacitySize>
          <n1:SysMemPrimaryStatus>1</n1:SysMemPrimaryStatus>
          <n1:SysMemTotalSize>65536</n1:SysMemTotalSize>
          <n1:SystemGeneration>12G Monolithic</n1:SystemGeneration>
          <n1:SystemID>1320</n1:SystemID>
          <n1:SystemRevision>0</n1:SystemRevision>
          <n1:TempRollupStatus>1</n1:TempRollupStatus>
          <n1:UUID>4c4c4544-0037-3810-8035-c6c04f325631</n1:UUID>
          <n1:VoltRollupStatus>1</n1:VoltRollupStatus>
          <n1:smbiosGUID>44454c4c-3700-1038-8035-c6c04f325631</n1:smbiosGUID>
        </n1:DCIM_SystemView>
      </wsen:Items>
      <wsen:EndOfSequence/>
    </wsen:PullResponse>
  </s:Body>
</s:Envelope>

Solution

  • What you get back as the response is a concatenation of multiple XML documents; in your example there are two.

    As a whole, this isn't well-formed XML, but it's also not uncommon.

    So all you need to do is split the documents and pick the one you want to deal with (in your example, the second):

    $split = preg_split('~\Q<?xml version="1.0" encoding="UTF-8"?>\E\R~u', $sequenced_xml, 2, PREG_SPLIT_NO_EMPTY);
    $xml   = simplexml_load_string($split[1]);
    

    Now that you have the XML document you're interested in, you can do what the many other answers suggest for parsing a SOAP response. The XML is no longer malformed (what you had was actually a sequence of concatenated well-formed XML documents).
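
If the number or order of documents in the output is not fixed, a hard-coded index may pick the wrong one. The following is a minimal sketch, not part of the original answer: `split_xml_documents()` and `find_document_by_action()` are hypothetical helper names, the sample input is a shortened stand-in for the real WSMan output, and it assumes every envelope carries an `s:Header` with a `wsa:Action` element, as both documents in the question do.

```php
<?php
// Hypothetical helper: split a string holding several concatenated XML
// documents into one string per document.
function split_xml_documents(string $sequenced): array
{
    // Split in front of every XML declaration; the lookahead keeps each
    // declaration attached to the document that follows it.
    $parts = preg_split('~(?=<\?xml\s)~', $sequenced, -1, PREG_SPLIT_NO_EMPTY);

    // Trim surrounding whitespace and drop fragments that end up empty.
    return array_values(array_filter(array_map('trim', $parts), 'strlen'));
}

// Hypothetical helper: return the first document whose wsa:Action ends
// with the given suffix, or null if none matches.
function find_document_by_action(array $documents, string $actionSuffix): ?SimpleXMLElement
{
    $soap = 'http://www.w3.org/2003/05/soap-envelope';
    $wsa  = 'http://schemas.xmlsoap.org/ws/2004/08/addressing';

    foreach ($documents as $document) {
        $xml = simplexml_load_string($document);
        if ($xml === false) {
            continue; // skip fragments that are not well-formed
        }
        // Assumes the envelope has an s:Header element.
        $action = (string) $xml->children($soap)->Header->children($wsa)->Action;
        if (substr($action, -strlen($actionSuffix)) === $actionSuffix) {
            return $xml;
        }
    }

    return null;
}

// Shortened stand-in for the WSMan output: two envelopes back to back.
$sequenced_xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
  <s:Header>
    <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/enumeration/EnumerateResponse</wsa:Action>
  </s:Header>
  <s:Body/>
</s:Envelope>
<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing">
  <s:Header>
    <wsa:Action>http://schemas.xmlsoap.org/ws/2004/09/enumeration/PullResponse</wsa:Action>
  </s:Header>
  <s:Body/>
</s:Envelope>
XML;

$documents = split_xml_documents($sequenced_xml);
$pull      = find_document_by_action($documents, '/PullResponse');

printf("%d documents found\n", count($documents));
printf("PullResponse document %s\n", $pull === null ? 'missing' : 'selected');
```

Selecting by action rather than by position keeps the rest of the code working if the tool ever emits more than one PullResponse or changes the order of the envelopes.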

    The rest is dealing with the namespaces.

    Some pointers:

    ... to get the SOAP envelope body:

    $soap = 'http://www.w3.org/2003/05/soap-envelope';
    $body = $xml->children($soap)->Body;
    

    ... all enumerated items as array:

    $wsen = 'http://schemas.xmlsoap.org/ws/2004/09/enumeration';
    $body->registerXPathNamespace('wsen', $wsen);
    $items = $body->xpath('.//wsen:*/*[not(namespace-uri(.) = namespace-uri(..))]');
    

    and so on and so forth.
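
The `xsi:nil="true"` attributes from the question can be handled the same way, without any str_replace: `xsi` is just another namespace prefix (the XML Schema instance namespace), so SimpleXML can read it through `attributes()`. Below is a minimal sketch under that assumption; the document is a trimmed-down stand-in for the PullResponse above, and `$fields` is an illustrative name, not something from the original code.

```php
<?php
// Trimmed-down stand-in for the PullResponse document from the question.
$document = <<<'XML'
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"
            xmlns:wsen="http://schemas.xmlsoap.org/ws/2004/09/enumeration"
            xmlns:n1="http://schemas.dell.com/wbem/wscim/1/cim-schema/2/DCIM_SystemView"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <s:Body>
    <wsen:PullResponse>
      <wsen:Items>
        <n1:DCIM_SystemView>
          <n1:AssetTag/>
          <n1:CMCIP xsi:nil="true"/>
          <n1:Model>PowerEdge R720xd</n1:Model>
        </n1:DCIM_SystemView>
      </wsen:Items>
    </wsen:PullResponse>
  </s:Body>
</s:Envelope>
XML;

$soap = 'http://www.w3.org/2003/05/soap-envelope';
$wsen = 'http://schemas.xmlsoap.org/ws/2004/09/enumeration';
$n1   = 'http://schemas.dell.com/wbem/wscim/1/cim-schema/2/DCIM_SystemView';
$xsi  = 'http://www.w3.org/2001/XMLSchema-instance';

$xml = simplexml_load_string($document);

// Walk down the tree, switching namespace wherever the prefix changes.
$view = $xml->children($soap)->Body
            ->children($wsen)->PullResponse->Items
            ->children($n1)->DCIM_SystemView;

$fields = [];
foreach ($view->children($n1) as $name => $element) {
    // xsi:nil lives in the XMLSchema-instance namespace, so ask for the
    // attributes of that namespace explicitly.
    $isNil = (string) $element->attributes($xsi)->nil === 'true';

    // Map nil elements to null, everything else to its text content.
    $fields[$name] = $isNil ? null : (string) $element;
}

var_export($fields);
```

In this sketch an element marked `xsi:nil="true"` such as CMCIP becomes `null`, while a plain empty element such as AssetTag becomes an empty string, so the two cases stay distinguishable. Casting the result with `(object) $fields` gives property-style access if that is preferred.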

    A note on the code example you gave in your own answer: if you're looking for an element name in any namespace, you can do this in XPath with local-name():

    $pullitem = 'ServiceTag';
    $try      = $xml->xpath(sprintf("//*[local-name(.)='%s']", $pullitem));
    
    printf("pullitem '%s' has been found in the following namespaces:\n", $pullitem);
    foreach ($try as $element) {
        $nsURI = dom_import_simplexml($element)->namespaceURI;
        printf(" - %s\n", $nsURI);
    }
    

    That spares you your five or so individual xpath calls.

    And finally, if you don't want to care about the namespace because you expect it to be "that one" for each item anyway, you can use DOMDocument to create a SimpleXMLElement for each result by extracting each into a new document of its own:

    /**
     * create a new SimpleXMLElement out of an existing one
     *
     * @param SimpleXMLElement $item
     *
     * @return SimpleXMLElement
     */
    function simplexml_export_element(SimpleXMLElement $item) {
        $doc  = new DOMDocument();
        $node = $doc->importNode(dom_import_simplexml($item), true);
        $node = $doc->appendChild($node);
        return simplexml_load_string($doc->saveXML($doc->documentElement), get_class($item), 0, $node->namespaceURI);
    }
    

    Such a helper routine is useful because it sets the element's own namespace as the namespace for the new SimpleXMLElement. This allows direct access to children in the same namespace, so code using it further on doesn't need to care about this "default" namespace specifically.

    Example:

    $split = preg_split('~\Q<?xml version="1.0" encoding="UTF-8"?>\E\R~u', $sequenced_xml, 2, PREG_SPLIT_NO_EMPTY);
    
    $xml  = new SimpleXMLElement($split[1]);
    
    $soap = 'http://www.w3.org/2003/05/soap-envelope';
    $body = $xml->children($soap)->Body;
    
    $wsen = 'http://schemas.xmlsoap.org/ws/2004/09/enumeration';
    $body->registerXPathNamespace('wsen', $wsen);
    
    $enumerated = $body->xpath('.//wsen:*/*[not(namespace-uri(.) = namespace-uri(..))]');
    $enumerated = array_map('simplexml_export_element', $enumerated);
    
    foreach ($enumerated as $item) {
        echo $item->getName(), "\n";
        foreach ($item as $key => $value) {
            printf(" - %s: %s\n", $key, $value);
        }
    }
    

    Output:

    DCIM_SystemView
     - AssetTag: 
     - BIOSReleaseDate: 11/20/2013
     - BIOSVersionString: 2.1.3
     - BaseBoardChassisSlot: NA
     - BatteryRollupStatus: 1
     - BladeGeometry: 255
     - BoardPartNumber: 061P35A00
     - BoardSerialNumber: CN70163231007K
     - CMCIP: 
     - CPLDVersion: 1.0.3
     - CPURollupStatus: 1
     - ChassisModel: 
     - ChassisName: Main System Chassis
     - ChassisServiceTag: REMOVED
     - ChassisSystemHeight: 2
     - DeviceDescription: System
     - ExpressServiceCode: 33088672189
     - FQDD: System.Embedded.1
     - FanRollupStatus: 1
     - HostName: 
     - InstanceID: System.Embedded.1
     - LastSystemInventoryTime: 20140928010936.000000+000
     - LastUpdateTime: 20140220171215.000000+000
     - LicensingRollupStatus: 1
     - LifecycleControllerVersion: 2.1.0
     - Manufacturer: Dell Inc.
     - MaxCPUSockets: 2
     - MaxDIMMSlots: 24
     - MaxPCIeSlots: 6
     - MemoryOperationMode: OptimizerMode
     - Model: PowerEdge R720xd
     - NodeID: F7852V1
     - PSRollupStatus: 1
     - PlatformGUID: 3156324f-c0c6-3580-3810-00374c4c4544
     - PopulatedCPUSockets: 2
     - PopulatedDIMMSlots: 8
     - PopulatedPCIeSlots: 2
     - PowerCap: 598
     - PowerCapEnabledState: 3
     - PowerState: 2
     - PrimaryStatus: 1
     - RollupStatus: 1
     - ServerAllocation: 
     - ServiceTag: REMOVED
     - StorageRollupStatus: 1
     - SysMemErrorMethodology: 6
     - SysMemFailOverState: NotInUse
     - SysMemLocation: 3
     - SysMemMaxCapacitySize: 1572864
     - SysMemPrimaryStatus: 1
     - SysMemTotalSize: 65536
     - SystemGeneration: 12G Monolithic
     - SystemID: 1320
     - SystemRevision: 0
     - TempRollupStatus: 1
     - UUID: 4c4c4544-0037-3810-8035-c6c04f325631
     - VoltRollupStatus: 1
     - smbiosGUID: 44454c4c-3700-1038-8035-c6c04f325631
    

    Hope this is still helpful in your case.