
PHP working with XMLReader with HUGE data source


I need to parse a ginormous data source (14.9M lines of XML, 1.7GB).

I am having problems using XMLReader to do this. I haven't needed anything but SimpleXML before, but since I really can't load this whopper into memory, I need to process it as a stream.

I have written this code:

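<?php

// Stream the file node by node instead of loading it all into memory.
$xml = new XMLReader();
$xml->open('public.xml');

// Print one dot for every node read.
while ($xml->read()) {
    echo '.';
}
$xml->close();
?>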

But I am having issues with execution. Namely, I get "Fatal error: Maximum execution time of 30 seconds exceeded..."

When I call set_time_limit(600), the browser just crashes.

Is it crashing because it can't handle the number of "." characters created?

What do you recommend here? Ultimately, I need to get this XML file into a relational database. I am testing feasibility before I get into the details of the schema.


Solution

  • Is it crashing because it can't handle the number of "." characters created?

    To test this, simply run it without echo '.';.
    If the script needs a lot of RAM, increase the maximum memory a script may use (PHP's memory_limit setting). Alternatively, split the XML file into smaller parts and process them sequentially.
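
The test suggested above could be sketched roughly as follows: read the stream without echoing anything per node, and print a single summary at the end. The helper name countElements(), the temporary stand-in file (used here in place of public.xml), and the 512M memory value are assumptions made for illustration only; they are not from the question.

```php
<?php
// Minimal sketch: stream an XML file with XMLReader and count elements
// instead of echoing one character per node.
// countElements() is a hypothetical helper name chosen for this example.

function countElements(string $path): int
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $count = 0;
    while ($reader->read()) {
        // Count only element start nodes; skip text, whitespace, end tags.
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $count++;
        }
    }
    $reader->close();

    return $count;
}

// Lift the execution time limit and raise the memory limit.
// 512M is an assumed value, adjust it to your environment.
set_time_limit(0);
ini_set('memory_limit', '512M');

// Small stand-in file so the sketch runs on its own;
// replace $tmp with the path to the real file (e.g. 'public.xml').
$tmp = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($tmp, '<rows><row id="1"/><row id="2"/><row id="3"/></rows>');

$total = countElements($tmp);
unlink($tmp);

// One line of output in total, however large the input is.
echo "Elements seen: $total\n";
```

If this version finishes where the original did not, the volume of output sent to the browser is the likely culprit rather than XMLReader itself. Running such a script from the command line instead of through a browser is another way to take the browser out of the picture.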

    You may also want to look at: