Using XMLReader with Nested Values


I'm parsing a large (approx. 1.5 MB) XML file with PHP. The nodes I want to focus on are about two levels deep, and for each of those nodes I want to pull out certain values.

I was hoping to use SimplePie for this but, from what I've read, XMLReader appears to be the better tool for the job. I've never used XMLReader before and was testing out this example. Unfortunately, it is not working for me.

Here is (some of) the XML:

<?xml version="1.0" encoding="UTF-8"?>
<comicinfo>
  <comiclist>
    <comic>
      <id>117</id>
      <index>1</index>
      <mainsection>
        <pagecount>33</pagecount>
        <credits>
          <credit>
            <role id="dfPenciler">Penciller</role>
            <roleid>dfPenciler</roleid>
            <person>
              <displayname>Jim Lawson</displayname>
              <sortname>Jim Lawson</sortname>
            </person>
          </credit>
          <credit>
            <role id="dfWriter">Writer</role>
            <roleid>dfWriter</roleid>
            <person>
              <displayname>Peter Laird</displayname>
              <sortname>Peter Laird</sortname>
            </person>
          </credit>
        </credits>
        <characters/>
        <series>
          <displayname>Teenage Mutant Ninja Turtles</displayname>
          <sortname>Teenage Mutant Ninja Turtles</sortname>
          <complete>No</complete>
          <bpseriesid>0</bpseriesid>
        </series>
      </mainsection>
      <collectionstatus listid="3">In Collection</collectionstatus>
      <rare boolvalue="0">No</rare>
      <coverfront>/Data/Images/tmnt_2.jpg</coverfront>
      <format>
        <displayname>Standard Comic Format</displayname>
        <sortname>Standard Comic Format</sortname>
      </format>
      <publisher>
        <displayname>Mirage Studios</displayname>
        <sortname>Mirage Studios</sortname>
      </publisher>
      <country>
        <displayname>USA</displayname>
        <sortname>USA</sortname>
      </country>
      <language>
        <displayname>English</displayname>
        <sortname>English</sortname>
      </language>
      <store>
        <displayname>All About Books &amp; Comics</displayname>
        <sortname>All About Books &amp; Comics</sortname>
      </store>
      <purchaseprice>$2.95</purchaseprice>
      <coverprice>$2.95</coverprice>
      <purchasedate>
        <year>
          <displayname>2003</displayname>
        </year>
        <month>1</month>
        <date>January 2003</date>
      </purchasedate>
      <condition>
        <displayname>Near Mint</displayname>
        <sortname>094 Near Mint</sortname>
        <lastname>094 Near Mint</lastname>
      </condition>
      <issuenr>2</issuenr>
      <publicationdate>
        <year>
          <displayname>2002</displayname>
        </year>
        <month>2</month>
        <date>February 2002</date>
      </publicationdate>
      <genres>
        <genre>
          <displayname>Science Fiction</displayname>
          <sortname>Science Fiction</sortname>
        </genre>
      </genres>
      <tags/>
      <links/>
      <lastmodified>
        <date>10/4/2007 6:17:29 AM</date>
      </lastmodified>
      <thumbfilepath>/Thumbnails/6108a98d11f81eee6dbd2a67c20b1650.jpg</thumbfilepath>
      <sections/>
      <seriesgroup>
        <displayname>Other</displayname>
        <sortname>Other</sortname>
      </seriesgroup>
      <issue>2</issue>
      <quantity>1</quantity>
      <bpcomicid>0</bpcomicid>
      <bpcomiclastreceivedrevision>0</bpcomiclastreceivedrevision>
      <bpseriesid>0</bpseriesid>
      <wraparoundcover boolvalue="0">No</wraparoundcover>
      <seriefirstletter>
        <displayname>T</displayname>
        <sortname>T</sortname>
      </seriefirstletter>
      <allcreators>Jim Lawson; Peter Laird</allcreators>
      <submissiondate/>
      <releasedate/>
      <readingdate/>
      <readtimes>0</readtimes>
      <readit>No</readit>
    </comic>
  </comiclist>
</comicinfo>

Here is the PHP I am using:

<?php
$z = new XMLReader;
$z->open('comiclist.xml');

$doc = new DOMDocument;

while ($z->read() && $z->name !== 'comic');

while ($z->name === 'comic')
{

    $node = simplexml_import_dom($doc->importNode($z->expand(), true));

    var_dump($node->element_1);

    $z->next('comic');
}

?>

What is being displayed is this:

object(SimpleXMLElement)#3 (0) { } object(SimpleXMLElement)#4 (0) { }

This is repeated over and over again, once for each node. What am I doing wrong, and is there a better way to do what I'm trying to accomplish?


Solution

  • I managed to solve the issue myself.

    After a few hours of trial and error (and research), I figured out how to do what I was asking for, using simplexml_load_file() rather than XMLReader. The test code is posted below for others. It prints three values for each 'comic' node:

    <?php
      $xml = simplexml_load_file('comiclist.xml');
    
      foreach ($xml->comiclist->comic as $comic) {
        echo $comic->mainsection->series->displayname . ' #' . $comic->issuenr . ' is ID number: ' . $comic->id . '<br />';
      }
    ?>
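
For anyone who still wants the XMLReader route: the empty objects in the question most likely come from `$node->element_1`, which looks like a placeholder name carried over from the example. No `<element_1>` exists in this XML, so SimpleXML returns an empty element each time. Below is a minimal sketch of the same loop using the real element names, including the nested `<credit>` entries. The function name `iterateComics` and the returned array keys are my own choices for illustration, not part of any library, and the inline XML is a trimmed-down stand-in for the real file.

```php
<?php
// Hypothetical helper: yields one summary array per <comic> element.
// Assumes every <comic> has a <mainsection> containing <credits> and <series>.
function iterateComics(XMLReader $reader): Generator
{
    $doc = new DOMDocument();

    // Advance to the first <comic> element, wherever it sits in the tree.
    while ($reader->read() && $reader->name !== 'comic');

    while ($reader->name === 'comic') {
        // expand() builds a DOM subtree for this single <comic> only,
        // so memory use stays proportional to one record.
        $comic = simplexml_import_dom($doc->importNode($reader->expand(), true));

        // Collect the nested credits as role => person name.
        $credits = [];
        foreach ($comic->mainsection->credits->credit as $credit) {
            $credits[(string) $credit->role] = (string) $credit->person->displayname;
        }

        yield [
            'id'      => (int) $comic->id,
            'series'  => (string) $comic->mainsection->series->displayname,
            'issue'   => (string) $comic->issuenr,
            'credits' => $credits,
        ];

        // Jump to the next <comic>, skipping the subtree just processed.
        $reader->next('comic');
    }
}

// Trimmed sample data standing in for comiclist.xml.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<comicinfo>
  <comiclist>
    <comic>
      <id>117</id>
      <mainsection>
        <credits>
          <credit>
            <role id="dfPenciler">Penciller</role>
            <person><displayname>Jim Lawson</displayname></person>
          </credit>
          <credit>
            <role id="dfWriter">Writer</role>
            <person><displayname>Peter Laird</displayname></person>
          </credit>
        </credits>
        <series><displayname>Teenage Mutant Ninja Turtles</displayname></series>
      </mainsection>
      <issuenr>2</issuenr>
    </comic>
  </comiclist>
</comicinfo>
XML;

$reader = new XMLReader();
$reader->XML($xml); // for the real file: $reader->open('comiclist.xml');

foreach (iterateComics($reader) as $comic) {
    echo $comic['series'], ' #', $comic['issue'], ' is ID number: ', $comic['id'], PHP_EOL;
    foreach ($comic['credits'] as $role => $person) {
        echo '  ', $role, ': ', $person, PHP_EOL;
    }
}

$reader->close();
```

At 1.5 MB, simplexml_load_file() as used above is probably simpler and fast enough. The streaming version is mainly worth the extra code if the file grows to a size where loading the whole document into memory becomes a problem.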