Search code examples
mongodbdatabase-indexesmongodb-indexes

Retrieving a section of a compound index


I have a large collection where I would like to retrieve results ordered by two fields. However, I would like to start retrieving results from deep within the results in a performant way.

For example, if the documents are:

{ age: 25, name: "Gabe" }
{ age: 25, name: "John" }
{ age: 25, name: "Mike" }
{ age: 26, name: "Gabe" }
{ age: 26, name: "John" }
{ age: 26, name: "Mike" }

I want to retrieve the documents ordered by { age: 1, name: 1 }. And I want to index directly into { age: 26, name: "John" } and retrieve the next N documents as ordered by the compound index (e.g., { age: 26, name: "Mike" }). The purpose is to return paged search results to a user.


Solution

  • That is exactly how mongodb indexed queries work.

    MongoDB indexes are similar to b-tree structures. When a compound index is used for selection and sorting, the query executor will begin scanning the index at the first matching value, and scan through the last match. The documents identified are then fetched to pass on to any further query stages.

    Some queries may require scanning multiple ranges of the index. The example query, scanning a single range of the index will find all matching documents in order, so not additional sorting is necessary.

    This query will need 2 parts, one part to match the first age and the partial list of name, and one to match the remaining ages. This can be done with the $or operator.

    To demonstrate this, I used explain with the executionStats option:

    db.collection.explain("executionStats").find({$or:[{age:{$eq:25},name:{$gte:"John"}},{age:{$gt:25}}]}).sort({age:1,name:1})
    

    And the winning plan shows two index scans, passing their results to a merge stage. Note that the fetch stage does not note any further filtering, and there is no in-memory sort:

    "winningPlan" : {
                "stage" : "SUBPLAN",
                "inputStage" : {
                    "stage" : "FETCH",
                    "inputStage" : {
                        "stage" : "SORT_MERGE",
                        "sortPattern" : {
                            "age" : 1,
                            "name" : 1
                        },
                        "inputStages" : [
                            {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "age" : 1,
                                    "name" : 1
                                },
                                "indexName" : "age_1_name_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "age" : [ ],
                                    "name" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                    "age" : [
                                        "[25.0, 25.0]"
                                    ],
                                    "name" : [
                                        "[\"John\", {})"
                                    ]
                                }
                            },
                            {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "age" : 1,
                                    "name" : 1
                                },
                                "indexName" : "age_1_name_1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                    "age" : [ ],
                                    "name" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : false,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                    "age" : [
                                        "(25.0, inf.0]"
                                    ],
                                    "name" : [
                                        "[MinKey, MaxKey]"
                                    ]
                                }
                            }
                        ]
                    }
                }
            },
    

    The executionStats section shows that the query executor examined 2 index keys and 2 documents, and the total number of documents returned from this query was 2:

        "executionStats" : {
            "executionSuccess" : true,
            "nReturned" : 5,
            "executionTimeMillis" : 3,
            "totalKeysExamined" : 5,
            "totalDocsExamined" : 5,
    

    (My laptop is a little underpowered so it actually took 3 milliseconds for that query)