Search code examples
mongodbmongodb-querymongodb-indexes

Mongodb index, should partialfilters also be included in the index?


I have a query like this:

collection.find({
  type: "person",
  image: {
    $exists: true
  }
}, {
  sort: [
    ["age", -1]
  ],
  limit: 9,
)

Do I need to include the fields in the index if I already have them filtered?

collection.createIndex(
  {type: 1, image: 1, age: -1}, 
  { 
    partialFilterExpression: {
      type: 'person',
      image: {
        $exists: true
      }
    },
    background: true
  }
)

Or would it already know by the partial filter expression and I'm only querying by what's not fixed?

collection.createIndex(
  {age: -1}, 
  { 
    partialFilterExpression: {
      type: 'person',
      image: {
        $exists: true
      }
    },
    background: true,
    name: "ageIndex"
  }
)

Thinking about it, perhaps the right thing to do is to not index all fields but to use hint to force the db to use the index?

collection.find({
  type: "person",
  image: {
    $exists: true
  }
}, {
  sort: [
    ["age", -1]
  ],
  limit: 9,
).hint("ageIndex")

Would this actually use index correctly and ignore the other documents in the collection?


Solution

  • Consider a collection of sample documents being queried, as in the question:

    { _id: 1, type: "person", image: "i-1", age: 19, "fld": 12 },
    { _id: 2, type: "person", image: "i-2", age: 22, "fld": 121 },
    { _id: 3, type: "thing", image: "i-99", age: 29, "fld": 1212 },    // 'type' not 'person'
    { _id: 4, type: "person", age: 31, "fld": 12121 },                 // 'image' missing
    { _id: 5, type: "person", image: "i-3", age: 13, "fld": 121212 },
    { _id: 6, type: "person", age: 43, "fld": 1212121 },               // 'image' missing
    { _id: 7, type: "person", image: "i-4", age: 20, "fld": 1 }
    

    Do I need to include the fields in the index if I already have them filtered? Or would it already know by the partial filter expression and I'm only querying by what's not fixed?

    Create an index:

    db.collection.createIndex( { type: 1, age: -1 },
      {
         partialFilterExpression: {
           type: 'person',
           image: { $exists: true }
        }
      }
    )
    

    Note the index fields are on type and age only. Why? This is explained in the Index Usage Verification, below.

    The query:

    db.collection.find( { type: "person", image: { $exists: true } } ).sort( { age: -1 } )
    

    The result:

    The query returns the expected filtered documents and in the sorted order.

    { "_id" : 2, "type" : "person", "image" : "i-2", "age" : 22, "fld" : 121 }
    { "_id" : 7, "type" : "person", "image" : "i-4", "age" : 20, "fld" : 1 }
    { "_id" : 1, "type" : "person", "image" : "i-1", "age" : 19, "fld" : 12 }
    { "_id" : 5, "type" : "person", "image" : "i-3", "age" : 13, "fld" : 121212 }
    


    Index Usage Verification:

    The index usage can be verified by generating the query plan using the explain method:

    db.collection.find( { type: "person", image: { $exists: true } } ).sort( { age: -1 } ).explain()
    

    The plan output shows the index usage for both the filter and sort operations. This is noted as the IXSCAN (indexed scan) and the absence of the SORT stage in the plan. This is the correct usage of the index for this query.

    In the index definition, the two fields type + age specify a compound index. This is required that the index is applied on the type (for filtering) and age (for sorting). The field image cannot be specified in the index definition as it is not used with an equality condition (uses $exists); if specified the index will not be used for the following sorted field (from the documentation):

    An index can support sort operations on a non-prefix subset of the index key pattern. To do so, the query must include equality conditions on all the prefix keys that precede the sort keys.

    The query plan (part of it):

    {
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.persons",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "$and" : [
                                {
                                        "type" : {
                                                "$eq" : "person"
                                        }
                                },
                                {
                                        "image" : {
                                                "$exists" : true
                                        }
                                }
                        ]
                },
                "queryHash" : "25E877F5",
                "planCacheKey" : "C9D745BE",
                "winningPlan" : {
                        "stage" : "FETCH",
                        "filter" : {
                                "image" : {
                                        "$exists" : true
                                }
                        },
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "type" : 1,
                                        "age" : -1
                                },
                                "indexName" : "type_1_age_-1",
                                "isMultiKey" : false,
                                "multiKeyPaths" : {
                                        "type" : [ ],
                                        "age" : [ ]
                                },
                                "isUnique" : false,
                                "isSparse" : false,
                                "isPartial" : true,
                                "indexVersion" : 2,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "type" : [
                                                "[\"person\", \"person\"]"
                                        ],
                                        "age" : [
                                                "[MaxKey, MinKey]"
                                        ]
                                }
                        }
                }, ...