Search code examples
mongodbmongodb-queryaggregation-frameworkmongodb-explain

In the mongodb explain, does it reveal the number of documents fetched from disk vs ram?


Let's suppose that a query examined 1 million records and returned 1000 of them.

In the explain of the query, does it reveal how many of the 1 million records examined needed to be fetched from SSD/HDD and how many of the 1000 returned docs needed to be fetched from SSD/HDD?

If so, which properties reveal this information?


Solution

  • No, for several reasons:

    • explain does not have access to that information.
      The mongod server is split into 2 discreet layers - database logic and storage engine. When a process in the database layer needs a document, it fetches it from the storage layer. The storage layer will pull it from cache if it is there, otherwise it requests it from the disk. Note that the operating system also maintains a cache, so the request from disk may be serviced from the filesystem cache instead of reading from disk. When the storage layer returns the document to the database layer, it does not indicate where it was fetched from.
    • with default options, explain does not fully execute the query.
      The default is to evaluate each candidate plan for a short period, and select the plan the produces the most results for the least amount of work during the test. Unless you specify the "executionStats" or "allPlansExecution" options, none of the plans are fully executed.
    • with execution stats option, the other plans are still tested, so threads other than the winning plan may have already pulled documents into the cache.
      When explain tests each plan it is partially executed, which will involve reading indexes and fetching documents, so there will always be something in the cache before the full execution begins.

    That information is also not static, so it would not be especially helpful if it were included in the explain output. Even from one execution of explain to the next what is already in the cache can change.

    You can use database profiling to monitor or log query execution so that you can see how much time various executions of the query spend waiting on cache or disk.