I have a Mongo collection called Elements containing ~9 million documents. Each document has the following structure:
{
_id : "1",
Timestamp : Numberlong(12345),
Nationality : "ITA",
Value: 5
}
If I run the following query:
db.Elements.find({ Nationality: 'ITA' })
the query performs fast (a few milliseconds).
If, instead, I run the following query:
db.Elements.find({ Timestamp: 12345 })
the query is slow, in the order of magnitude of tens of seconds. Obviously, if I add an index on Timestamp
, the query runs much faster. Running the same query on the field Value
, which is of type Int32, runs as fast as the first query.
What I am trying to understand is: why would the second query (without index) perform significantly worse than the first? Does Mongo treat Int64 values differently than other values?
It turns out I was making a mistake.
I was using Robomongo to execute the queries; by default, Robomongo pages the results (the default page size is 50 items).
Because the Timestamp
field contains values that are almost always different, the query had to perform an almost-full scan before it could fill up and return one page. On the other hand, because the other fields contain values that have a limited range (the Value
field, although it is Int32, has a limited domain in my application) I was getting results quickly because I was only looking at the first page.
When I run the same queries without pages (e.g. by appending a count
or obtaining an execution plan) all the queries have poor performances without indexes.
Therefore, there doesn't seem to be any special treatment of Int64 values as opposed to other primitive types.