So, I've been trying out the php framework lithium now and it seems like a really good framework and all but I have a slight problem. A query I run on a collection with only 6k+ documents is amazingly slow from php but blazingly fast when I run it from the terminal.
One document in the collection may look like this:
{
"_id" : ObjectId("504c9a3b6070d8b7ea61938e"),
"startDate" : "Jan 2011",
"episodes" : [
{
"title" : "Series 1, Episode 1",
"airdate" : ISODate("2011-01-20T00:00:00Z"),
"epnum" : "1",
"prodnum" : null,
"seasonnum" : "01",
"link" : "http://www.tvrage.com/10_OClock_Live/episodes/1065007783"
},
{and maybe 20 more},
],
"runTime" : "60 min",
"endDate" : "Apr 2012",
"network" : "Channel 4",
"numberOfEpisodes" : "25 eps",
"title" : "10 O'Clock Live",
"directory" : "10OClockLive",
"country" : "UK",
"tvrage" : "27363"
}
I want to get all episodes that exists for this current month. So in the terminal (I use fake values and more than a month) I use the following query:
db.series.find({'episodes.airdate': {$gt: ISODate('2012-09-07 00:00:00'), $lt: ISODate('2012-11-01')}})
And wham, it just goes very fast. Even if I do an explain() on the query it tells me that it's fast:
{
"cursor" : "BtreeCursor episodes.airdate_1",
"isMultiKey" : true,
"n" : 382,
"nscannedObjects" : 1620,
"nscanned" : 1620,
"nscannedObjectsAllPlans" : 1620,
"nscannedAllPlans" : 1620,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
**"millis" : 181**,
"indexBounds" : {
"episodes.airdate" : [
[
ISODate("2012-09-07T00:00:00Z"),
ISODate("292278995-01--2147483647T07:12:56.808Z")
]
]
},
"server" : "example:27017"
}
But when I use the query inside php and lithium, man, it take ages:
$series = Series::find('all', array(
'fields' => array('title', 'episodes.title', 'episodes.airdate'),
'conditions' => array('episodes.airdate' => array('$gt' => new MongoDate(strtotime(date('Y-m-01'))), '$lt' => new MongoDate(strtotime(date('Y-m-t')))))
));
And if I even try to loop through it, then it's even worse well past the 30 second execution time. All though, I think I have a memory leak since I had to add this ini_set('memory_limit', '-1');
without getting a "maxium usage" or whatever.
Could anyone provide me with a answer on why this is happening? Is there any way to improve the speed of the query? I have no idea why it is so slow and I would be super glad if anyone could point me in the right direction.
The issue is that Lithium boxes all the data in objects, which for large queries can be very memory-intensive, hence slow. If you don't need any ActiveRecord features for that particular query, there's an option you can pass to find()
which gets passed to MongoDb::read()
(so check the docs for MongoDb::read()
) that allows you to get back either a raw array, or the actual database cursor which you can iterate over manually.
The other option is to wait till I implement streaming iteration, which will solve the memory problem. :-)