Search code examples
mongodbsortingaggregation-frameworkmongodb-indexes

Index vs Aggregation Pipeline for Sorting


I'm developing an application using MongoDB as its database, and for sorting data, I encountered an interesting argument from a colleague that index can be used instead of aggregation pipeline for getting sorted data.

I tried this and it actually works; using an index with specified field and direction does return sorted data when queried. When using aggregation pipeline, I also obtained the same result.

I have created an index with the following specification:

index name: batch_deleted_a_desc

num: asc
marked: asc
value: desc

Using aggregation pipeline:

> db.test.aggregate([{$match: {num:"3",marked:false}}, {$sort:{"value":-1}}])
{ "_id" : ObjectId("5d70b40ba7bebd3d7c135615"), "value" : 4, "marked" : false, "num" : "3" }
{ "_id" : ObjectId("5d70b414a7bebd3d7c135616"), "value" : 2, "marked" : false, "num" : "3" }
{ "_id" : ObjectId("5d70b3fea7bebd3d7c135614"), "value" : 1, "marked" : false, "num" : "3" }

Using index:

> db.test.find({num:"3",marked:false})
{ "_id" : ObjectId("5d70b40ba7bebd3d7c135615"), "value" : 4, "marked" : false, "num" : "3" }
{ "_id" : ObjectId("5d70b414a7bebd3d7c135616"), "value" : 2, "marked" : false, "num" : "3" }
{ "_id" : ObjectId("5d70b3fea7bebd3d7c135614"), "value" : 1, "marked" : false, "num" : "3" }

As you can see, the results are the same. But I am unsure that using index for getting sorted data is a good practice, and yet using aggregation pipeline is (sometimes) taking more effort than just creating index.

So, which would be the best option?


Solution

  • In the context of the question, the better option would be the aggregation because it explicitly specifies the sort.

    In the query example, results are being returned in order specified by the index because the query is using the index { num: 1, marked: 1, value: 1}. However, nothing specified in the query will guarantee that ordering, meaning results may change at some point in the future. For example, consider the case where the index { num: 1, marked: 1, updated_at: 1 } were to be created. The query planner may decide to use this index instead, which may result in results in a different order.

    In the absence of a sort, a query would return results in the order of the index being used, but you should not rely upon that ordering without explicitly specifying it. Quoting the docs:

    Unless you specify the sort() method or use the $near operator, MongoDB does not guarantee the order of query results.