PHP simpleXML trying to process fairly complex file


The file I have to work with has the following structure:

<?xml version="1.0" encoding="UTF-8" ?>
<FormattedReport xmlns = 'urn:crystal-reports:schemas' xmlns:xsi = 'http://www.w3.org/2000/10/XMLSchema-instance'>
    <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
 <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
        </FormattedAreaPair>
        </FormattedReport>

So what I'm trying to do is call a PHP function which will parse the XML and eventually store it in an SQL DB.

For example:

ManifestNR: 1903 ShippingDate: 12/04/2011 CarrierID: TNT03 TrackingRef: 234234232 ... etc for each record ...

So I set about trying to do this using DOM and then stumbled across SimpleXML. I've read several tutorials and searched implementations here, but I just can't seem to access the data in the final nodes (or any other data, to be honest). Is SimpleXML a no-no with this kind of structure?

The latest PHP I'm using is:

<?php

if (file_exists('tracking.xml')) {
    $xml = simplexml_load_file('tracking.xml');

  //  print_r($xml);

   foreach( $xml as $FormattedReport->FormattedAreaPair->FormattedAreaPair ) 
        {
        foreach($FormattedReport as $node->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects)
        echo $node->FormattedReportObject->Value;
        }

} else {
    exit('Failed to open xml');
}
?>

I've tried to strip it right back to basics, but still no luck. Doesn't echo a result.

Thanks for your time guys!

SOLVED

For anyone in similar circumstances, here's a bit of direction.

  1. Ignore the root node; that's your default $variable when you import the XML string/file.
  2. If you have nested groups, create a node for the parent first, like so: $xml->FormattedAreaPair->FormattedAreaPair as $parentnode
  3. Using your parent node, loop through all the children.
  4. If you have an attribute field, access it as follows: (string) $node['FieldName']
  5. Compare the retrieved attribute with a string and then handle the result.
  6. Stop pulling your hair out.

    <?php

    if (file_exists('tracking.xml')) {
        $xml = simplexml_load_file('tracking.xml');

        //print_r($xml);

        // Outer loop: one Level="1" FormattedAreaPair per record.
        foreach ($xml->FormattedAreaPair->FormattedAreaPair as $parentnode) {
            // Inner loop: every field (FormattedReportObject) in that record.
            foreach ($parentnode->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $node) {
                //echo "FormattedValue: ".$node->FormattedValue."<br />";
                switch ((string) $node['FieldName']) {
                    case '{tblCon.ManifestNR}':
                        echo 'Manifest: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ShippingDate}':
                        echo 'Shipping Date: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CarrierID}':
                        echo 'Carrier ID: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CustConRefTX}':
                        echo 'Customer Reference: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ServiceCodeTX}':
                        echo 'Service Code: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalWeightNR}':
                        echo 'Total Weight: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ValueNR}':
                        echo 'Value: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalVolumeNR}':
                        echo 'Total Volume: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.GoodsDesc}':
                        echo 'Goods Description: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblConAddr.ReceiverNameTX}':
                        echo 'Receiver Name: '.$node->FormattedValue."<br />";
                        break;
                    case '{@SalesOrder}':
                        echo 'Sales Order: '.$node->FormattedValue."<br />";
                        break;
                    case '{@TrackingReference}':
                        echo 'Tracking Reference: '.$node->FormattedValue."<br />";
                        break;
                }
            }
            echo "---------------------------- <br />";
        }
    } else {
        exit('Failed to open xml');
    }
    ?>
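Since the original goal was to get the data into an SQL DB, here is a rough sketch of how the same two loops could collect each record into an array instead of echoing it. This is only an illustration under a few assumptions: the inline sample stands in for tracking.xml, the second record's values are made up, collectRecords is a hypothetical helper name, and it reads the raw Value element rather than FormattedValue on the guess that unformatted values are easier to store.

```php
<?php
// Hypothetical sample: two records shaped like the report in the question.
// The second record's values are invented for illustration.
$xmlstr = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<FormattedReport xmlns="urn:crystal-reports:schemas" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
 <FormattedAreaPair Level="0" Type="Report">
  <FormattedAreaPair Level="1" Type="Details">
   <FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0">
    <FormattedReportObjects>
     <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}">
      <ObjectName>ManifestNR1</ObjectName><FormattedValue>1,907</FormattedValue><Value>1907.00</Value>
     </FormattedReportObject>
     <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}">
      <ObjectName>ShippingDate1</ObjectName><FormattedValue>14/04/2011</FormattedValue><Value>2011-04-14T00:00:00</Value>
     </FormattedReportObject>
    </FormattedReportObjects>
   </FormattedSection></FormattedSections></FormattedArea>
  </FormattedAreaPair>
  <FormattedAreaPair Level="1" Type="Details">
   <FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0">
    <FormattedReportObjects>
     <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}">
      <ObjectName>ManifestNR1</ObjectName><FormattedValue>1,908</FormattedValue><Value>1908.00</Value>
     </FormattedReportObject>
     <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}">
      <ObjectName>ShippingDate1</ObjectName><FormattedValue>15/04/2011</FormattedValue><Value>2011-04-15T00:00:00</Value>
     </FormattedReportObject>
    </FormattedReportObjects>
   </FormattedSection></FormattedSections></FormattedArea>
  </FormattedAreaPair>
 </FormattedAreaPair>
</FormattedReport>
XML;

/**
 * Hypothetical helper: returns one associative array per Level="1"
 * FormattedAreaPair, keyed by the FieldName attribute with its braces stripped.
 */
function collectRecords(SimpleXMLElement $xml): array
{
    $records = [];
    foreach ($xml->FormattedAreaPair->FormattedAreaPair as $pair) {
        $record = [];
        $objects = $pair->FormattedArea->FormattedSections
            ->FormattedSection->FormattedReportObjects->FormattedReportObject;
        foreach ($objects as $object) {
            // '{tblCon.ManifestNR}' becomes 'tblCon.ManifestNR'
            $field = trim((string) $object['FieldName'], '{}');
            // Cast to string so the array holds plain values, not SimpleXMLElement objects.
            $record[$field] = (string) $object->Value;
        }
        $records[] = $record;
    }
    return $records;
}

$records = collectRecords(new SimpleXMLElement($xmlstr));
print_r($records);
```

Each element of $records is then a plain array of strings, which could be bound to a prepared INSERT statement one record at a time.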

Solution

  • The examples in the Manual should suffice (Example #4 in particular). You seem like a sufficiently clever fellow. The problem is that your foreach loops are written the wrong way round.

    example.php

    <?php
    $xmlstr = <<<XML
    <?xml version='1.0' standalone='yes'?>
    <movies>
     <movie>
      <title>PHP: Behind the Parser</title>
      <characters>
       <character>
        <name>Ms. Coder</name>
        <actor>Onlivia Actora</actor>
       </character>
       <character>
        <name>Mr. Coder</name>
        <actor>El Act&#211;r</actor>
       </character>
      </characters>
      <plot>
       So, this language. It's like, a programming language. Or is it a
       scripting language? All is revealed in this thrilling horror spoof
       of a documentary.
      </plot>
      <great-lines>
       <line>PHP solves all my web problems</line>
      </great-lines>
      <rating type="thumbs">7</rating>
      <rating type="stars">5</rating>
     </movie>
    </movies>
    XML;
    ?>
    

    Example #4

    <?php
    include 'example.php';
    
    $xml = new SimpleXMLElement($xmlstr);
    
    /* For each <character> node, we echo a separate <name>. */
    foreach ($xml->movie->characters->character as $character) {
       echo $character->name, ' played by ', $character->actor, PHP_EOL;
    }
    
    ?>
    

    Notice that when using the foreach construct, you need to specify the path to the nodes of a certain type as the first item (before "as"). The second item in the foreach is just an (empty) variable that you use to store the current node in the iteration.
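The same foreach pattern also covers the attribute check from the question's follow-up. Below is a minimal sketch using a trimmed copy of the manual's movie document (cut down here to just the rating elements): the path goes before "as", the loop variable after it, and the attribute is cast to a string before it is compared. The $lines array is only there for illustration.

```php
<?php
// Trimmed copy of the manual's movie document, just enough for the loop.
$xmlstr = <<<'XML'
<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;

$xml = new SimpleXMLElement($xmlstr);

$lines = [];
// Path to the repeated nodes on the left of "as", loop variable on the right.
foreach ($xml->movie->rating as $rating) {
    // Attributes are read with array syntax; cast to string before comparing.
    switch ((string) $rating['type']) {
        case 'thumbs':
            $lines[] = $rating . ' thumbs up';
            break;
        case 'stars':
            $lines[] = $rating . ' stars';
            break;
    }
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```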