Search code examples
phpperformancemongodblithium

Extremely slow lithium query, fast in MongoDB


So, I've been trying out the php framework lithium now and it seems like a really good framework and all but I have a slight problem. A query I run on a collection with only 6k+ documents is amazingly slow from php but blazingly fast when I run it from the terminal.

One document in the collection may look like this:

{
    "_id" : ObjectId("504c9a3b6070d8b7ea61938e"),
    "startDate" : "Jan 2011",
    "episodes" : [
        {
            "title" : "Series 1, Episode 1",
            "airdate" : ISODate("2011-01-20T00:00:00Z"),
            "epnum" : "1",
            "prodnum" : null,
            "seasonnum" : "01",
            "link" : "http://www.tvrage.com/10_OClock_Live/episodes/1065007783"
        },
        {and maybe 20 more},
    ],
    "runTime" : "60 min",
    "endDate" : "Apr 2012",
    "network" : "Channel 4",
    "numberOfEpisodes" : "25 eps",
    "title" : "10 O'Clock Live",
    "directory" : "10OClockLive",
    "country" : "UK",
    "tvrage" : "27363"
}

I want to get all episodes that exists for this current month. So in the terminal (I use fake values and more than a month) I use the following query:

db.series.find({'episodes.airdate': {$gt: ISODate('2012-09-07 00:00:00'), $lt: ISODate('2012-11-01')}})

And wham, it just goes very fast. Even if I do an explain() on the query it tells me that it's fast:

{
    "cursor" : "BtreeCursor episodes.airdate_1",
    "isMultiKey" : true,
    "n" : 382,
    "nscannedObjects" : 1620,
    "nscanned" : 1620,
    "nscannedObjectsAllPlans" : 1620,
    "nscannedAllPlans" : 1620,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    **"millis" : 181**,
    "indexBounds" : {
        "episodes.airdate" : [
            [
                ISODate("2012-09-07T00:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "example:27017"
}

But when I use the query inside php and lithium, man, it take ages:

$series = Series::find('all', array(
                'fields' => array('title', 'episodes.title', 'episodes.airdate'),
                'conditions' => array('episodes.airdate' => array('$gt' => new MongoDate(strtotime(date('Y-m-01'))), '$lt' =>  new MongoDate(strtotime(date('Y-m-t')))))
            ));

And if I even try to loop through it, then it's even worse well past the 30 second execution time. All though, I think I have a memory leak since I had to add this ini_set('memory_limit', '-1'); without getting a "maxium usage" or whatever.

Could anyone provide me with a answer on why this is happening? Is there any way to improve the speed of the query? I have no idea why it is so slow and I would be super glad if anyone could point me in the right direction.


Solution

  • The issue is that Lithium boxes all the data in objects, which for large queries can be very memory-intensive, hence slow. If you don't need any ActiveRecord features for that particular query, there's an option you can pass to find() which gets passed to MongoDb::read() (so check the docs for MongoDb::read()) that allows you to get back either a raw array, or the actual database cursor which you can iterate over manually.

    The other option is to wait till I implement streaming iteration, which will solve the memory problem. :-)