I have a very large table which I want to process row by row in PHP.
Here is what I have tried:
PDO::fetchAll()
Loads all rows into memory at once: out of memory error.
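For reference, a minimal sketch of that attempt (the connection details and the big_table name are placeholders):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass'); // placeholder credentials
$stmt = $pdo->query('SELECT * FROM big_table'); // placeholder table name
// fetchAll() materialises the entire result set as one PHP array,
// so memory usage grows with the table size and the script dies here.
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
foreach ($rows as $row) {
    // process one row (e.g. write it to disk)
}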
Paginating with PDO::fetchAll() and LIMIT X, 1000 (i.e. splitting the large query into smaller ones)
I was surprised to see that the memory usage kept growing continuously until out of memory (at least up to 1 GB). Why is the memory not released between queries?
I have no global variables and no class/object properties; the pagination is encapsulated in a method, so the local variables (the rows fetched from the db) should be collected by the garbage collector once the current page has been processed… I also do not keep the rows in memory, I write them to disk.
Another drawback of this technique is that it results in several queries instead of one, which are slow on this very large table.
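A rough sketch of this paginated approach, assuming placeholder names (big_table, the connection details and the per-row processing are made up):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass'); // placeholder credentials

// Process one page of rows and return how many rows were fetched.
function processPage(PDO $pdo, int $offset, int $pageSize): int
{
    // MySQL "LIMIT offset, count" pagination
    $stmt = $pdo->query("SELECT * FROM big_table LIMIT $offset, $pageSize");
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    foreach ($rows as $row) {
        // write the row to disk instead of keeping it in memory
    }
    return count($rows); // $rows is a local variable and goes out of scope here
}

$offset = 0;
while (processPage($pdo, $offset, 1000) === 1000) {
    $offset += 1000; // yet peak memory keeps climbing from page to page
}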
PDO::fetch() with unbuffered queries
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
Finally the memory usage stays very low (~13 MB), but the process is very slow, as each row seems to mean a network round trip to the MySQL server.
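And a sketch of that unbuffered, row-by-row variant (same placeholder names):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass'); // placeholder credentials
// Unbuffered mode: rows are streamed from the server instead of being
// copied into PHP memory all at once when the query is executed.
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$stmt = $pdo->query('SELECT * FROM big_table');
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    // process one row at a time: memory stays flat (~13 MB here), but it is slow
}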
So my questions: why is the memory not released between queries when paginating, and do you see another, better solution?
Probably I don't understand your question… But I created a simple script that iterates over all rows of a table with ~5,436,226 rows (and 19 columns) and saves them to an output file. I used PostgreSQL instead of MySQL (but I do not think that is a problem; you only need to change the LIMIT clause).
<?php
ini_set('memory_limit', '100M');
$pdo = new PDO('pgsql:host=localhost;port=5432;dbname=test', 'postgres', 'postgres');
$page = 0;
while (true) {
    echo $page.PHP_EOL;
    // Fetch the next page of 100 rows; only this page is ever held in memory.
    $stmt = $pdo->prepare('SELECT * FROM table ORDER BY id LIMIT 100 OFFSET '.($page * 100));
    $stmt->execute();
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    if (!$rows) {
        break; // no more rows: stop instead of looping forever
    }
    // Append the page to the output file instead of accumulating it in memory.
    file_put_contents('/var/www/test/tmp/out.txt', json_encode($rows), FILE_APPEND);
    $page++;
}
The output file size is ~1 GB.