I have a 5 MB XML file.
I'm using the following code to get every nodeValue:
$dom = new DomDocument('1.0', 'UTF-8');
if(!$dom->load($url))
    return;
$games = $dom->getElementsByTagName("game");
foreach($games as $game)
{
}
This takes 76 seconds, and there are around 2000 game tags.
Is there any optimization or another solution to get the data?
You shouldn't use the Document Object Model on large XML files. It is intended for human-readable documents, not big datasets!
If you want fast access, you should use XMLReader or SimpleXML.
XMLReader is ideal for parsing whole documents, and SimpleXML has a nice XPath function for retrieving data quickly.
For XMLReader you can use the following code:
<?php
// Parsing a large document with XMLReader, using expand() with DOM/DOMXPath
$reader = new XMLReader();
$reader->open("tooBig.xml");
while ($reader->read()) {
    switch ($reader->nodeType) {
        case XMLReader::ELEMENT:
            if ($reader->localName == "game") {
                // Copy only the current <game> subtree into a small DOM document
                $node = $reader->expand();
                $dom = new DOMDocument();
                $n = $dom->importNode($node, true);
                $dom->appendChild($n);
                $xp = new DOMXPath($dom);
                $res = $xp->query("/game/title"); // this is an example
                echo $res->item(0)->nodeValue;
            }
            break;
    }
}
$reader->close();
?>
The above will output all game titles (assuming each game element has a title child, i.e. a /game/title structure).
For SimpleXML you can use:
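$xml = file_get_contents($url);
$sxml = new SimpleXMLElement($xml);
$games = $sxml->xpath('//game'); // returns an array of SimpleXMLElement nodes
foreach ($games as $game)
{
    // SimpleXMLElement has no nodeValue property; read a child and cast it to string
    // (assumes each <game> has a <title> child, as in the XMLReader example)
    print (string) $game->title;
}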
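The two approaches can also be combined: stream the file with XMLReader and hand each game element to SimpleXML on its own, so only one game is held as an object at a time. The following is a minimal sketch, not a definitive implementation. The function name readGameTitles is hypothetical, and it assumes, as above, that every game element has a title child.

```php
<?php
// Hypothetical helper: returns the <title> text of every <game> element.
// Assumes each <game> has a <title> child.
function readGameTitles(string $path): array
{
    $titles = [];
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        return $titles;
    }

    // Advance to the first <game> start tag, if there is one.
    $found = false;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'game') {
            $found = true;
            break;
        }
    }

    while ($found) {
        // Parse only the current <game> subtree with SimpleXML.
        $game = new SimpleXMLElement($reader->readOuterXml());
        $titles[] = (string) $game->title;
        // Move to the next <game>, skipping the subtree just handled.
        $found = $reader->next('game');
    }

    $reader->close();
    return $titles;
}
```

Calling readGameTitles("tooBig.xml") would return the titles as a plain array. Using next('game') rather than read() in the second loop avoids walking through the children of a game that has already been processed.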