neo4j, difference, neo4j-bolt, neo4j-driver

What is the difference between `result_available_after` and `result_consumed_after` in `BoltStatementResultSummary`? How do I measure query execution time?


I tried to understand this from the docs here. Quoting from them:

result_available_after = None The time it took for the server to have the result available.

result_consumed_after = None The time it took for the server to consume the result.

I still do not understand the actual difference. Which one should I consider if I want to scale the program and find the execution time of the query? And why does result_available_after become 0 ms if I run the same query a second time? Is it because of caching? I tried setting dbms.memory.pagecache.size=1M as suggested here, but it made no difference. How can I measure only the execution time of the query?

I am using Neo4j 4.0 and the Neo4j Python driver for queries.


Solution

  • result_available_after is the time that elapsed before the first result was found and made available. Many more results may still need to be found and streamed from the query, but this is roughly the point at which a client can begin receiving results.

    result_consumed_after is the time by which all results had been found and consumed by (that is, sent to) the client that made the query. The query is completely finished at this point.

    As for seeing result_available_after go to 0, that's likely a combination of two things:

    1. Because you had just run the query, its plan has been cached, so subsequent executions do not need to compile and plan again; the plan is simply retrieved from the cache and executed. This factors out compilation and planning time.

    2. If the query ran against the same data, the parts of the graph it touched and retrieved should now be in the page cache. Subsequent executions traverse faster because they only have to hit the page cache, which factors out disk I/O. This is why it is ideal to have enough RAM to configure the page cache to cover as much of the graph as possible.
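To see both values side by side, here is a minimal sketch for the Python driver. It assumes a 4.x driver, where `session.run()` returns a result whose `consume()` method returns the summary. The helper names `measure_query` and `compare_runs`, and the connection details in the comments, are my own placeholders rather than anything from the driver. Both summary fields are reported in milliseconds and may be None if the server does not send them. The client-side wall-clock figure is added only for comparison, since it also includes network and driver overhead.

```python
import time
from typing import Any, Dict, List


def measure_query(session: Any, query: str, **params: Any) -> Dict[str, Any]:
    """Run `query` once and return server-reported and client-side timings (ms)."""
    start = time.perf_counter()
    result = session.run(query, **params)
    # consume() discards any unread records and returns the result summary.
    summary = result.consume()
    wall_ms = (time.perf_counter() - start) * 1000.0
    return {
        "result_available_after_ms": summary.result_available_after,
        "result_consumed_after_ms": summary.result_consumed_after,
        "client_wall_clock_ms": wall_ms,
    }


def compare_runs(session: Any, query: str, runs: int = 3, **params: Any) -> List[Dict[str, Any]]:
    """Run the same query several times so cold and warm runs can be compared."""
    return [measure_query(session, query, **params) for _ in range(runs)]


# Hypothetical usage (URI, credentials, and query are placeholders):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
#   with driver.session() as session:
#       for timings in compare_runs(session, "MATCH (n) RETURN count(n)"):
#           print(timings)
#   driver.close()
```

The first run typically pays for planning and disk reads, so for a steady-state figure it is usually more informative to look at the later runs than to try to defeat the caches.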