Search code examples
mongodbaggregation-framework

Why are 10 $in with small lists way faster then 1 $in with a large list


having a mongodb collection with around 20 million documents.

Each document has a property "userId". I want to select all documents, that are one of the userId's given in a list of userId's I have.

This list is around 2'000 userIds.

First approach was to just write an aggregation that first has a match including a $in:userIds.

This performance kind of "ok". It takes around 5 seconds to fulfill the request.

When I do split up the query into 10 queries where each query is doing 200 of the 2'000 userids, I end up having the result in around 700ms.

Why is splitting up the query that much faster?

I do have an index on the userId-field (actually it is a compound index, as I filter 2 different properties as well). I'm using MongoDb 8


As written in the comments, here some additional information:

The complete structure of a document:

{
   _id: ...
   learnerId: <string>
   type: <string>
   organizationId: <string>
   timestamp: ISODate
   payload: {
     learningPackage: {
        learningPackageId: <string>
     }
   }
}

This is my complete query:

db.getCollection("event").aggregate(
[
  {
    "$match": {
     "organizationId": "yyy",
  "type": "ttt",
  "payload.learningPackage.learningPackageId": "XXXf",
  "timestamp": {
    "$lt": new ISODate("2025-01-23T22:59:59.999Z"),
    "$gte": new ISODate("2021-10-12T22:00:00.000Z")
  },
      "learnerId": {
        "$in": [
          "LearnerId1",
          "LearnerId2",
          "LearnerId3",
          "LearnerId4",
          "LearnerId5"
        ]
      },
    }
  },
  {
    "$group": {
      "_id": "$learnerId",
      "n": {
        "$sum": 1
      }
    }
  }
]

)

The winning plan:

{
            "isCached" : false,
            "queryPlan" : {
                "stage" : "GROUP",
                "planNodeId" : 3.0,
                "inputStage" : {
                    "stage" : "PROJECTION_COVERED",
                    "planNodeId" : 2.0,
                    "transformBy" : {
                        "learnerId" : true,
                        "_id" : false
                    },
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "planNodeId" : 1.0,
                        "keyPattern" : {
                            "type" : 1.0,
                            "organizationId" : 1.0,
                            "payload.learningPackage.learningPackageId" : 1.0,
                            "timestamp" : -1.0,
                            "learnerId" : 1.0
                        },
                        "indexName" : "type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1",
                        "isMultiKey" : false,
                        "multiKeyPaths" : {
                            "type" : [

                            ],
                            "organizationId" : [

                            ],
                            "payload.learningPackage.learningPackageId" : [

                            ],
                            "timestamp" : [

                            ],
                            "learnerId" : [

                            ]
                        },
                        "isUnique" : false,
                        "isSparse" : false,
                        "isPartial" : false,
                        "indexVersion" : 2.0,
                        "direction" : "forward",
                        "indexBounds" : {
                            "type" : [
                                "..."
                            ],
                            "organizationId" : [
                                "..."
                            ],
                            "payload.learningPackage.learningPackageId" : [
                                "..."
                            ],
                            "timestamp" : [
                                "..."
                            ],
                            "learnerId" : [...]
                        }
                    }
                }
            },
            "slotBasedPlan" : {
                "slots" : "$$RESULT=s18 env: { s5 = IndexBounds(\"field #0['type']: [CollationKey(0x5175697a526573756c74), CollationKey(0x5175697a526573756c74)], field #1['organizationId']: [CollationKey(0x36313637303466663062\"...), s9 = Nothing, s13 = true }",
                "stages" : "[3] project [s18 = newBsonObj(\"_id\", s15, \"n\", s17)] \n[3] project [s17 = (convert ( s16, int32) ?: s16)] \n[3] group [s15] [s16 = count()] spillSlots[s14] mergingExprs[sum(s14)] \n[3] project [s15 = (s1 ?: null)] \n[1] branch {s13} [s1, s12] \n[s2, s4] [1] ixscan_generic s5 none s4 none none lowPriority [s2 = 4] @\"7299d749-3274-4950-8600-ccdb65d12b9a\" @\"type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1\" true \n[s3, s6] [1] nlj inner [] [s7, s8] \n    left \n        [1] project [s7 = getField(s10, \"l\"), s8 = getField(s10, \"h\")] \n        [1] unwind s10 s11 s9 false \n        [1] limit 1ll \n        [1] coscan \n    right \n        [1] ixseek s7 s8 none s6 none none [s3 = 4] @\"7299d749-3274-4950-8600-ccdb65d12b9a\" @\"type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1\" true \n"
            }
        }

Note 1: I've removed all the sensitive data. So the index-bounds were filled with the expected values.

Not 2: In the query the mentioned field is called "learnerId" not "userId".

Execution Stats:

{
        "executionSuccess" : true,
        "nReturned" : 1528.0,
        "executionTimeMillis" : 4590.0,
        "totalKeysExamined" : 2067367.0,
        "totalDocsExamined" : 0.0,
        "executionStages" : {
            "stage" : "project",
            "planNodeId" : 3.0,
            "nReturned" : 1528.0,
            "executionTimeMillisEstimate" : 4530.0,
            "opens" : 1.0,
            "closes" : 1.0,
            "saveState" : 233.0,
            "restoreState" : 233.0,
            "isEOF" : 1.0,
            "projections" : {
                "18" : "newBsonObj(\"_id\", s15, \"n\", s17) "
            },
            "inputStage" : {
                "stage" : "project",
                "planNodeId" : 3.0,
                "nReturned" : 1528.0,
                "executionTimeMillisEstimate" : 4530.0,
                "opens" : 1.0,
                "closes" : 1.0,
                "saveState" : 233.0,
                "restoreState" : 233.0,
                "isEOF" : 1.0,
                "projections" : {
                    "17" : "(convert ( s16, int32) ?: s16) "
                },
                "inputStage" : {
                    "stage" : "group",
                    "planNodeId" : 3.0,
                    "nReturned" : 1528.0,
                    "executionTimeMillisEstimate" : 4530.0,
                    "opens" : 1.0,
                    "closes" : 1.0,
                    "saveState" : 233.0,
                    "restoreState" : 233.0,
                    "isEOF" : 1.0,
                    "groupBySlots" : [
                        Long("15")
                    ],
                    "expressions" : {
                        "16" : "count() ",
                        "initExprs" : {
                            "16" : null
                        }
                    },
                    "mergingExprs" : {
                        "14" : "sum(s14) "
                    },
                    "usedDisk" : false,
                    "spills" : 0.0,
                    "spilledBytes" : 0.0,
                    "spilledRecords" : 0.0,
                    "spilledDataStorageSize" : 0.0,
                    "inputStage" : {
                        "stage" : "project",
                        "planNodeId" : 3.0,
                        "nReturned" : 1683098.0,
                        "executionTimeMillisEstimate" : 4269.0,
                        "opens" : 1.0,
                        "closes" : 1.0,
                        "saveState" : 233.0,
                        "restoreState" : 233.0,
                        "isEOF" : 1.0,
                        "projections" : {
                            "15" : "(s1 ?: null) "
                        },
                        "inputStage" : {
                            "stage" : "branch",
                            "planNodeId" : 1.0,
                            "nReturned" : 1683098.0,
                            "executionTimeMillisEstimate" : 4244.0,
                            "opens" : 1.0,
                            "closes" : 1.0,
                            "saveState" : 233.0,
                            "restoreState" : 233.0,
                            "isEOF" : 1.0,
                            "numTested" : 1.0,
                            "thenBranchOpens" : 1.0,
                            "thenBranchCloses" : 1.0,
                            "elseBranchOpens" : 0.0,
                            "elseBranchCloses" : 0.0,
                            "filter" : "s13 ",
                            "thenSlots" : [
                                Long("2"),
                                Long("4")
                            ],
                            "elseSlots" : [
                                Long("3"),
                                Long("6")
                            ],
                            "outputSlots" : [
                                Long("1"),
                                Long("12")
                            ],
                            "thenStage" : {
                                "stage" : "ixscan_generic",
                                "planNodeId" : 1.0,
                                "nReturned" : 1683098.0,
                                "executionTimeMillisEstimate" : 4234.0,
                                "opens" : 1.0,
                                "closes" : 1.0,
                                "saveState" : 233.0,
                                "restoreState" : 233.0,
                                "isEOF" : 1.0,
                                "indexName" : "type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1",
                                "keysExamined" : 2067367.0,
                                "seeks" : 384269.0,
                                "numReads" : 2067367.0,
                                "recordIdSlot" : 4.0,
                                "outputSlots" : [
                                    Long("2")
                                ],
                                "indexKeysToInclude" : "00000000000000000000000000010000"
                            },
                            "elseStage" : {
                                "stage" : "nlj",
                                "planNodeId" : 1.0,
                                "nReturned" : 0.0,
                                "executionTimeMillisEstimate" : 0.0,
                                "opens" : 0.0,
                                "closes" : 0.0,
                                "saveState" : 233.0,
                                "restoreState" : 233.0,
                                "isEOF" : 0.0,
                                "totalDocsExamined" : 0.0,
                                "totalKeysExamined" : 0.0,
                                "collectionScans" : 0.0,
                                "collectionSeeks" : 0.0,
                                "indexScans" : 0.0,
                                "indexSeeks" : 0.0,
                                "indexesUsed" : [
                                    "type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1"
                                ],
                                "innerOpens" : 0.0,
                                "innerCloses" : 0.0,
                                "outerProjects" : [

                                ],
                                "outerCorrelated" : [
                                    Long("7"),
                                    Long("8")
                                ],
                                "outerStage" : {
                                    "stage" : "project",
                                    "planNodeId" : 1.0,
                                    "nReturned" : 0.0,
                                    "executionTimeMillisEstimate" : 0.0,
                                    "opens" : 0.0,
                                    "closes" : 0.0,
                                    "saveState" : 233.0,
                                    "restoreState" : 233.0,
                                    "isEOF" : 0.0,
                                    "projections" : {
                                        "7" : "getField(s10, \"l\") ",
                                        "8" : "getField(s10, \"h\") "
                                    },
                                    "inputStage" : {
                                        "stage" : "unwind",
                                        "planNodeId" : 1.0,
                                        "nReturned" : 0.0,
                                        "executionTimeMillisEstimate" : 0.0,
                                        "opens" : 0.0,
                                        "closes" : 0.0,
                                        "saveState" : 233.0,
                                        "restoreState" : 233.0,
                                        "isEOF" : 0.0,
                                        "inputSlot" : 9.0,
                                        "outSlot" : 10.0,
                                        "outIndexSlot" : 11.0,
                                        "preserveNullAndEmptyArrays" : 0.0,
                                        "inputStage" : {
                                            "stage" : "limit",
                                            "planNodeId" : 1.0,
                                            "nReturned" : 0.0,
                                            "executionTimeMillisEstimate" : 0.0,
                                            "opens" : 0.0,
                                            "closes" : 0.0,
                                            "saveState" : 233.0,
                                            "restoreState" : 233.0,
                                            "isEOF" : 0.0,
                                            "inputStage" : {
                                                "stage" : "coscan",
                                                "planNodeId" : 1.0,
                                                "nReturned" : 0.0,
                                                "executionTimeMillisEstimate" : 0.0,
                                                "opens" : 0.0,
                                                "closes" : 0.0,
                                                "saveState" : 233.0,
                                                "restoreState" : 233.0,
                                                "isEOF" : 0.0
                                            }
                                        }
                                    }
                                },
                                "innerStage" : {
                                    "stage" : "ixseek",
                                    "planNodeId" : 1.0,
                                    "nReturned" : 0.0,
                                    "executionTimeMillisEstimate" : 0.0,
                                    "opens" : 0.0,
                                    "closes" : 0.0,
                                    "saveState" : 233.0,
                                    "restoreState" : 233.0,
                                    "isEOF" : 0.0,
                                    "indexName" : "type_1_organizationId_1_payload.learningPackage.learningPackageId_1_timestamp_-1_learnerId_1",
                                    "keysExamined" : 0.0,
                                    "seeks" : 0.0,
                                    "numReads" : 0.0,
                                    "recordIdSlot" : 6.0,
                                    "outputSlots" : [
                                        Long("3")
                                    ],
                                    "indexKeysToInclude" : "00000000000000000000000000010000",
                                    "seekKeyLow" : "s7 ",
                                    "seekKeyHigh" : "s8 "
                                }
                            }
                        }
                    }
                }
            }
        }
    }

This is the query executed with ~1600 learner Ids. If I run it with around 180 learnerIds, the number of keys examined reduces to 402809 and the query only takes around 383ms. The rest of the plan is identical.

The compound index is defined as this: type_1_organizationId_1_payload.learningPackage.leagningPackageId_1_learnerId_1_timestamp_-1


Solution

  • Unless I've seriously misunderstood something about your question, I think the answer is parallelism.

    At the beginning you mention that:

    • A single query takes ~5 seconds of wall clock time
    • 10 queries that each do ~10% of the work give you "the result" in ~0.7 seconds

    When I first read that second part I had assumed you meant that the requests were sequential. But now rereading (given the additional context below) I realize that you actually don't specify. You also don't specify what "the result" means when you're running 10 queries or if you're submitting them in parallel. I'm assuming now that these queries are being submitted in parallel and you see approximately 0.7 seconds of total wall clock duration to get the results (and perhaps finish combining them client side).

    If you submitted the operations in parallel, then they could each take 0.5 seconds of wall clock time and that would perfectly mirror the 10% of work that each is doing. The total amount of execution time is the same (0.5s x 10 = 5s), we're just changing it to be done concurrently rather than sequentially. This fits pretty nicely into the 0.7 seconds that you referenced for that approach.

    Apart from sharding, MongoDB presently does not have any intraquery parallelism capabilities as far as I'm aware. So your large single query will run in a single thread and the totality of its execution will manifest as elapsed wall clock time.


    In the additional information, you've given us these critical pieces of information:

    List size Keys examined Duration Keys per millisecond
    ~1,600 2,067,367 4.590 seconds 450 keys/ms
    180 (11.25%) 402,809 (19.48%) 0.383 seconds (8.34%) 1051 keys/ms

    While this single comparison does support the claim that the smaller query is disproportionately faster, it is only marginally so. The original question implied that sequential execution of 10 smaller queries ran 86% faster than the combined query. But these later numbers suggest that the wall combined time of the smaller queries would be along the lines of:

    • 1.966 seconds (processing ~2M keys at ~1k per ms)
    • 3.404 seconds (using that 383ms duration for each 11.25% portion of the list size)

    Again, faster but within the same order of magnitude and much closer than how I originally read the description. It's also important to emphasize the fact that this is a single comparison point - any number of factors can influence it and therefore the conclusions we draw from it. Even things like the amount of seeking in the index that this smaller list of 180 items had to do may or may not be proportional to the seeks needed overall (384,269), and if it is a lower portion then this "subquery" would misrepresent the amount of "work" that will be needed overall.

    In general I'd also put out a reminder here that there can be more overhead when computing larger result sets. The big thing to be aware of would be if the operation had to spill to disk, but the execution stats that you provided for the larger operation directly show that this is not a factor at the moment.