php, session, memory-management, memory-limit

How to manage PHP memory?


I wrote a one-off script that I use to parse PDFs saved in the database. It was working okay until I ran out of memory after parsing 2,700+ documents.

The basic flow of the script is as follows:

  1. Get a list of all the document IDs to be parsed and save it as an array in the session (~155k documents).
  2. Display a page with a button to start parsing.
  3. When that button is clicked, make an AJAX request that parses the first 50 documents in the session array:

// Open the session before reading from $_SESSION.
if(session_status() == PHP_SESSION_NONE) {
  session_start();
}

$files = $_SESSION['files'];

$ids = array();

$slice = array_slice($files, 0, 50);    // the 50 documents to parse on this request
$files = array_slice($files, 50, null); // remove the 50 we are parsing on this request

// Write the shrunken list back and release the session lock right away,
// so the next AJAX request isn't blocked while parsing runs.
$_SESSION['files'] = $files;
session_write_close();

// Build a named placeholder (:id_0 ... :id_49) for each document in the batch.
for($i = 0; $i < count($slice); $i++) {
  $ids[] = ":id_{$i}";
}
$ids = implode(", ", $ids);

$sql = "SELECT d.id, d.filename, d.doc_content
  FROM proj_docs d
  WHERE d.id IN ({$ids})";

$stmt = oci_parse($objConn, $sql);
// Bind each document ID in the batch to its placeholder.
for($i = 0; $i < count($slice); $i++) {
  oci_bind_by_name($stmt, ":id_{$i}", $slice[$i]);
}
oci_execute($stmt, OCI_DEFAULT);
$cnt = oci_fetch_all($stmt, $data); // fetches the entire batch into $data at once
oci_free_statement($stmt);

# Do the parsing..
# Output a table row..

  4. The response to each AJAX request includes a status indicating whether the script has finished parsing all ~155k documents; if it hasn't, another AJAX request is made to parse the next 50. There's a 5-second delay between requests.
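For reference, the step-4 status could be computed from what's left of the session array. This is a minimal sketch assuming a JSON response; the key names (done, remaining, rows) and the $html variable are hypothetical, since the original script emits table rows directly:

// Minimal sketch of the step-4 status response; key names and $html are hypothetical.
// $files still holds the IDs left over after slicing off this batch.
header('Content-Type: application/json');
echo json_encode(array(
  'done'      => count($files) === 0, // client stops polling when this is true
  'remaining' => count($files),       // documents still queued in the session
  'rows'      => $html,               // table rows built by the parsing loop
));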

Questions

  • Why am I running out of memory? I expected peak memory usage to occur at step #1, when the session holds the full list of document IDs, not a few minutes later, when the session array holds 2,700 fewer elements.
  • I saw a few questions similar to mine. Some suggested setting the memory limit to unlimited, which I don't want to do at all. Others suggested setting variables to null when appropriate, and I did that (see below), but I still ran out of memory after parsing ~2,700 documents. What other approaches should I try?

# Freeing some memory space
$batch_size = null;
$with_xfa = null;
$non_xfa = null;
$total = null;
$files = null;
$ids = null;
$slice = null;
$sql = null;
$stmt = null;
$objConn = null;
$i = null;
$data = null;
$cnt = null;
$display_class = null;
$display = null;
$even = null;
$tr_class = null;
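As a diagnostic, logging memory at a few checkpoints would show whether the session array, the fetched batch, or the parsing itself is what grows. memory_get_usage(), memory_get_peak_usage(), and gc_collect_cycles() are standard PHP functions; the checkpoint labels and placement here are only suggestions:

// Diagnostic sketch: log real (OS-allocated) memory at checkpoints to see what grows.
function log_mem($label) {
  error_log(sprintf("%s: %.1f MB used, %.1f MB peak",
    $label,
    memory_get_usage(true) / 1048576,
    memory_get_peak_usage(true) / 1048576));
}

log_mem('after session slice'); // place right after $_SESSION['files'] is trimmed
log_mem('after oci_fetch_all'); // place right after the batch is fetched
gc_collect_cycles();            // after nulling, collect any lingering reference cycles
log_mem('after cleanup');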

Solution

  • I'm not really sure why, but reducing the number of documents parsed per batch from 50 down to 10 seems to fix the issue. I've gone past 5,000 documents now and the script is still running. My only guess is that with batches of 50 I must have hit runs of large files that used up all of the allotted memory.
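    A minimal sketch of turning the batch size into a single knob, so experiments like 50 down to 10 only touch one line; the BATCH_SIZE constant is a name introduced here, not something from the original script:

    define('BATCH_SIZE', 10); // was 50; smaller batches bound the worst-case memory per request

    $slice = array_slice($files, 0, BATCH_SIZE);    // documents for this request
    $files = array_slice($files, BATCH_SIZE, null); // what remains for later requests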

    Update #1

    I got another out-of-memory error at 8,500+ documents. I've reduced the batches further, down to 5 documents each, and will see tomorrow whether it gets all the way through. If that fails, I'll just increase the allocated memory temporarily.

    Update #2

    So it turns out that the only reason I was running out of memory is that we apparently have multiple PDF files over 300 MB uploaded to the database. I increased the memory allotted to PHP to 512 MB, and that seems to have allowed me to finish parsing everything.
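    For a one-off script, the limit can be raised at runtime instead of editing php.ini, and since the root cause was a handful of 300 MB+ PDFs, filtering oversized rows in SQL is another option. ini_set() and Oracle's DBMS_LOB.GETLENGTH are real APIs; the 256 MB threshold below is just an example:

    // Raise the limit for this script only, leaving php.ini untouched.
    ini_set('memory_limit', '512M');

    // Alternative sketch: skip oversized documents and handle them separately.
    $sql = "SELECT d.id, d.filename, d.doc_content
      FROM proj_docs d
      WHERE d.id IN ({$ids})
        AND DBMS_LOB.GETLENGTH(d.doc_content) < 268435456"; // < 256 MB; threshold is an example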