DocumentDB ignores indexes of any field instead of sorted
db.requests.aggregate([
{ $match: {'DeviceId': '5f68c9c1-73c1-e5cb-7a0b-90be2f80a332'}},
{ $sort: { 'Timestamp': 1 } }
])
Useful information:
> explain('executionStats')
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "admin_portal.requests",
"winningPlan" : {
"stage" : "IXSCAN",
"indexName" : "Timestamp_1",
"direction" : "forward"
}
},
"executionStats" : {
"executionSuccess" : true,
"executionTimeMillis" : "398883.755",
"planningTimeMillis" : "0.274",
"executionStages" : {
"stage" : "IXSCAN",
"nReturned" : "20438",
"executionTimeMillisEstimate" : "398879.028",
"indexName" : "Timestamp_1",
"direction" : "forward"
}
},
"serverInfo" : {
...
},
"ok" : 1.0,
"operationTime" : Timestamp(1622585939, 1)
}
> db.requests.getIndexKeys()
[
{
"_id" : 1
},
{
"Timestamp" : 1
},
{
"DeviceId" : 1
}
]
It works fine when I query documents without sorting or when I use find and sort function instead of aggregation.
Important note: Also it works perfect on original MongoDB instance, but not on the DocumentDB
This is more of "how does DocumentDB choose a query plan" kind of question. There are many answers on how Mongo does it on stackoverflow.
Clearly choosing the "wrong" index can happen from failed trials based on data distribution, the issue here is that DocumentDB adds an unknown layer.
Amazon DocumentDB emulates the MongoDB 4.0 API on a purpose-built database engine that utilizes a distributed, fault-tolerant, self-healing storage system. As a result, query plans and the output of explain() may differ between Amazon DocumentDB and MongoDB. Customers who want control over their query plan can use the $hint operator to enforce selection of a preferred index.
They state that due to this layer differences might happen.
So now that we understand why a wrong index is selected ( kind of ). what can we do? well unless you want to drop or rebuilt your indexes differently somehow then you need to use the hint options for your pipeline.
db.collection.aggregate(pipeline, {hint: "index_name"})