Tags: mongodb, mongodb-query, nosql, aggregation-framework, mongodb-indexes

Ensuring that the $sort stage will only hold the number of documents specified in the $limit stage


Consider the case where $limit comes immediately after $sort:

db.col.aggregate([
  { $match: { } },
  { $sort: { } },
  { $limit: 10 }
])

I would assume that in the query above, the $sort stage holds at most 10 documents at a time, so memory usage is not a concern.

But what if the query has more stages between the $sort and $limit that may filter out documents:

db.col.aggregate([
  { $match: { } },
  { $sort: { } },

  { $set: { } },
  { $match: { } },

  { $limit: 10 }
])

In this query, is it still possible for the $sort stage to hold no more than 10 documents? Once 10 documents have made it through to the $limit stage, it should be able to let the $sort stage know, right?


Solution

  • Each document that passes through one stage of the pipeline is handed to the next stage. As per the question, there are two cases:

    1. When $sort and $limit are consecutive, MongoDB keeps only the top N sorted documents in memory, where N is the limit. This is why you don't see any memory-related errors here.

    2. In your second case, where other stages sit between $sort and $limit, this optimization is not possible if any of those stages can change the number of documents, as filtering or grouping does (the $match in your pipeline is one such stage). $sort then has to hold all of its input documents in memory, so you might run into memory-related errors.

    See the $sort documentation.
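
To make the two cases above concrete, here is a minimal sketch in plain JavaScript. It is an illustrative model only, not MongoDB's actual implementation: the helper names sortWithLimit and fullSort, and the fields score and active, are hypothetical and chosen just for this example. It shows that a sort with the limit folded in never buffers more than N documents, and that folding the limit in ahead of a later filter would return the wrong result.

```javascript
// Illustrative model only: plain JavaScript, NOT MongoDB's real implementation.
// sortWithLimit, fullSort, score and active are hypothetical names.

// Case 1: the limit is folded into the sort, so at most n documents are buffered.
function sortWithLimit(docs, key, n) {
  const top = []; // kept in ascending order by key, never longer than n
  let peak = 0;   // largest number of documents buffered at once
  for (const doc of docs) {
    // Walk back from the end to find where doc belongs.
    let i = top.length;
    while (i > 0 && top[i - 1][key] > doc[key]) i--;
    if (i < n) {
      top.splice(i, 0, doc);
      if (top.length > n) top.pop(); // drop the document that fell out of the top n
    }
    peak = Math.max(peak, top.length);
  }
  return { result: top, peakBuffered: peak };
}

// Case 2: a plain sort has to buffer every input document.
function fullSort(docs, key) {
  const all = [...docs].sort((a, b) => a[key] - b[key]);
  return { result: all, peakBuffered: all.length };
}

// 1000 sample documents with distinct scores 0..999; only even scores are active.
const docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({ score: (i * 37) % 1000, active: i % 2 === 0 });
}

// $sort immediately followed by $limit: only 3 documents are ever held.
const consecutive = sortWithLimit(docs, "score", 3);

// $sort, then a filter, then $limit: the sort must see and keep everything.
const sorted = fullSort(docs, "score");
const correct = sorted.result.filter((d) => d.active).slice(0, 3);

// Folding the limit into the sort anyway would lose documents after the filter.
const wrong = sortWithLimit(docs, "score", 3).result.filter((d) => d.active);

console.log(consecutive.peakBuffered);    // 3
console.log(sorted.peakBuffered);         // 1000
console.log(correct.map((d) => d.score)); // [ 0, 2, 4 ]
console.log(wrong.map((d) => d.score));   // [ 0, 2 ]
```

The last two lines are the reason the limit cannot simply be passed back to $sort: with a filter in between, the sort cannot know in advance how many of its top documents will survive.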