Search code examples
mongodbindexinggeospatialcompound-index

MongoDB, how to create index for query containing a geospatial query, a range query and sort on different columns?


So if I have a query that does the following (in pseudo code)

find(a nearby x, b > y).sort(c)

where a is a geo column, b is type of long, and c is also a type of long

Would the compound index on (a:2d, b:1, c:1) work and suggested?


Solution

  • Geospatial queries have their own index category (as you mention), and the geohashing greatly improves the index performance of the first key lookup--it's better than a range if you can set it up right. In any case, I think your strategy will work: the key will be setting $maxDistance to something fairly small.

    I inserted 10 million random geo records to match your description, like so:

    { "_id" : ObjectId("4f28e1cffc90631d239f8b5a"), "a" : [ 46, 47 ], "b" : ISODate("2012-02-01T06:53:25.543Z"), "c" : 19 }
    { "_id" : ObjectId("4f28e1bdfc90631d239c4272"), "a" : [ 54, 48 ], "b" : ISODate("2012-02-01T06:53:32.699Z"), "c" : 20 }
    { "_id" : ObjectId("4f28e206fc90631d23aac59d"), "a" : [ 46, 52 ], "b" : ISODate("2012-02-01T06:55:14.103Z"), "c" : 22 }
    { "_id" : ObjectId("4f28e1a7fc90631d23995700"), "a" : [ 54, 52 ], "b" : ISODate("2012-02-01T06:52:33.312Z"), "c" : 27 }
    { "_id" : ObjectId("4f28e1d7fc90631d23a0e9e7"), "a" : [ 52, 46 ], "b" : ISODate("2012-02-01T06:53:11.315Z"), "c" : 31 }
    

    With the maxDistance at something below 10 the performance is really quite good.

    db.test13.find({a:{$near:[50,50], $maxDistance:4}, b:{$gt:d}}).sort({c:1}).explain();
    {
    "cursor" : "GeoSearchCursor",
    "nscanned" : 100,
    "nscannedObjects" : 100,
    "n" : 100,
    "scanAndOrder" : true,
    "millis" : 25,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
    
    }
    }
    

    If you leave out maxDistance it starts to suffer quite a bit. Some of the queries took up to 60 seconds to run. The secondary range parameter doesn't seem to help much, even if the range is fairly narrow--it seems to be all about the maxDistance.

    I recommend you play around with it to get a feel for how the geospatial index performs. Here is my test insert loop. You can try limiting the bits as well for less resolution

    function getRandomTime() {
       return new Date(new Date() - Math.floor(Math.random()*1000000));
    }
    
    function getRandomGeo() {
       return [Math.floor(Math.random()*360-180),Math.floor(Math.random()*360-180)];
    }
    
    function initialInsert() {
       for(var i = 0; i < 10000000; i++) {
          db.test13.save({
             a:getRandomGeo(),
             b:getRandomTime(),
             c:Math.floor(Math.random()*1000)
          });
       }
    }