I have a large Mongo collection that I want to iterate over, so I do something like this:
$cursor = $mongo->my_big_collection->find([]);
foreach ($cursor as $doc)
    do_something();
But I eventually run out of memory. I expected the cursor to free the memory after each document was processed. Why isn't that the case?
I tried calling unset($doc) at the end of my loop, but that didn't help.
Right now I have to do something like this to get around the issue (processing the documents in batches and calling unset() on the cursor after each batch):
for ($skip = 0; true; $skip += 1000)
{
    $cursor = $mongo->my_big_collection->find()->skip($skip)->limit(1000);
    if (!$cursor->hasNext())
        break;
    foreach ($cursor as $doc)
        do_something();
    unset($cursor);
}
This seems awkward. The whole point of iterators is to not have to do this. Is there a better way?
I'm using hhvm 3.12 with mongofill.
Thank you for your help.
MongoCursor.php
    /**
     * Advances the cursor to the next result
     *
     * @return void - NULL.
     */
    public function next()
    {
        $this->doQuery();
        $this->fetchMoreDocumentsIfNeeded(); // <<< add documents to $this->documents
        $this->currKey++;
    }

    /**
     * Return the next object to which this cursor points, and advance the
     * cursor
     *
     * @return array - Returns the next object.
     */
    public function getNext()
    {
        $this->next();
        return $this->current();
    }
As you iterate through the cursor, it stores every document it fetches in $this->documents.
Nothing ever clears that array, so memory use keeps growing with the number of documents you have iterated over.
You could try implementing an iteration that removes each document from $this->documents once it has been returned.
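Here is a minimal sketch of that idea. It is not mongofill's code: ReleasingCursor, its $fetchBatch callback and bufferedCount() are hypothetical names made up for illustration, and the data source is a plain in-memory array standing in for the server. It only shows the principle of dropping a document from the buffer as soon as it has been handed out.

```php
<?php
// Hypothetical stand-in for a buffering cursor; NOT mongofill's actual class.
class ReleasingCursor
{
    private $fetchBatch;      // callable($offset): array; returns [] when exhausted (assumption)
    private $documents = [];  // buffer of unread documents, keyed by absolute position
    private $currKey = 0;     // position of the next document to return
    private $fetched = 0;     // total number of documents fetched so far

    public function __construct(callable $fetchBatch)
    {
        $this->fetchBatch = $fetchBatch;
    }

    public function hasNext()
    {
        $this->fetchMoreDocumentsIfNeeded();
        return array_key_exists($this->currKey, $this->documents);
    }

    public function getNext()
    {
        if (!$this->hasNext()) {
            return null;
        }
        $doc = $this->documents[$this->currKey];
        // Release the buffer slot so only unread documents stay in memory.
        unset($this->documents[$this->currKey]);
        $this->currKey++;
        return $doc;
    }

    public function bufferedCount()
    {
        return count($this->documents);
    }

    private function fetchMoreDocumentsIfNeeded()
    {
        if (array_key_exists($this->currKey, $this->documents)) {
            return; // the next document is already buffered
        }
        $batch = call_user_func($this->fetchBatch, $this->fetched);
        foreach ($batch as $doc) {
            $this->documents[$this->fetched++] = $doc;
        }
    }
}

// Fake data source: 10 "documents" served in batches of 4.
$source = range(1, 10);
$cursor = new ReleasingCursor(function ($offset) use ($source) {
    return array_slice($source, $offset, 4);
});

$seen = 0;
$maxBuffered = 0;
while ($cursor->hasNext()) {
    $doc = $cursor->getNext();
    $seen++;
    // Track how many documents the buffer holds right after each read.
    $maxBuffered = max($maxBuffered, $cursor->bufferedCount());
}
```

With this approach the buffer never holds more than the unread remainder of one batch, however many documents are iterated. The trade-off is that such a cursor cannot be rewound without querying again, which may be why a library would keep the documents around.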