I'm working on a Symfony 2-based project and need to regularly import 1.5 million products from a .csv document using a cron task.
The import currently looks like this:
...
Propel::disableInstancePooling();

$fp = fopen($file, 'r');
while (($line = fgets($fp, 2048)) !== false) {
    $data = str_getcsv($line, "\t");

    $product = new Product();
    $product->setId($data[0]);
    $product->setTitle($data[1]);
    ...
    $product->save();
}
fclose($fp);
...
However, after about 5 minutes PHP hits its 512MB memory limit and throws the following error:
Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 78 bytes) in /var/www/vhosts/myapp.local/app/cache/dev/classes.php on line 9451
I've disabled Propel's instance pooling (this has worked in the past with Propel-powered imports in other frameworks), but the problem is still occurring, so I'm wondering whether Symfony 2 is performing some kind of caching (based on the error being thrown).
Does anyone have experience with performing large imports in Symfony 2 who can point me in the right direction?
Actually, this import should be done at the MySQL level directly (MySQL can read data from CSV files itself, via LOAD DATA INFILE) unless the mapping is very complex; see the sketch below. The other approach is to read the file incrementally, generate an SQL file, and then import that into the database natively. There is no reason for such a large import to be handled by a web application directly, because you will run out of memory.
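For the MySQL-level approach, a single LOAD DATA INFILE statement can ingest the tab-separated file without hydrating any objects in PHP. A minimal sketch, assuming a product table with id and title columns; the DSN, credentials, table/column names and file path are placeholders, and local_infile must be allowed on the server:

// Sketch only: let MySQL read the CSV directly.
$pdo = new PDO(
    'mysql:host=localhost;dbname=myapp;charset=utf8',   // placeholder DSN
    'db_user',
    'db_pass',
    array(PDO::MYSQL_ATTR_LOCAL_INFILE => true)          // required for LOCAL INFILE
);

$sql = <<<'SQL'
LOAD DATA LOCAL INFILE '/path/to/products.csv'
INTO TABLE product
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(id, title)
SQL;

$pdo->exec($sql);

You could just as well run the same statement from the mysql client; the point is that no Product objects are created in PHP at all.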
Make sure that you don't load the entire CSV into memory at once; stream it row by row instead, as in the sketch below.
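A rough sketch of streaming the file with fgetcsv() rather than reading it all with file() or file_get_contents(); the file path is a placeholder:

$fp = fopen('/path/to/products.csv', 'r');
if ($fp === false) {
    throw new RuntimeException('Could not open CSV file');
}

while (($row = fgetcsv($fp, 0, "\t")) !== false) {
    // only the current row is held in memory here
    // ... build the corresponding SQL line or entity ...
}

fclose($fp);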
An additional solution is to give PHP more memory in php.ini and hope for the best. PHP is not the best language for batch processing large amounts of data.
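If you do raise the limit, you can set it globally in php.ini or only for the one CLI run; the value and the console command name below are just examples:

; php.ini
memory_limit = 1024M

# or per run, without touching php.ini (command name is hypothetical)
php -d memory_limit=1024M app/console myapp:import-products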