Tags: neo4j, cypher, neo4j-python-driver

Getting node/edge creation/removal statistics


I am running a Python (3.8) script which uses pip library neo4j 4.0.0 to interact with a community edition neo4j 4.1.1 server.

I am running many queries that use MERGE to create nodes and relationships if they don't exist and update them otherwise. So far this is working well: the database is receiving the data as intended.

From my script's side, however, I would like to know how many nodes and edges were created by each query. The issue is that the Cypher queries I send to the database from my script call MERGE more than once and also use APOC procedures (though the APOC ones only update labels; they don't create entities).

Here is an example of a query:

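from typing import List

# `threads` (a list of dicts) and `driver` are defined elsewhere in the script.
comment_field_names: List[str] = list(threads[0].keys())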
cypher_single_properties: List[str] = []
for c in comment_field_names:
    cypher_single_properties.append("trd.{0} = {1}.{0}".format(c, "trd_var"))
cypher_property_string: str = ", ".join(cypher_single_properties)


with driver.session() as session:
    crt_stmt = ("UNWIND $threads AS trd_var "

                "MERGE (trd:Thread {thread_id:trd_var.thread_id}) "
                "ON CREATE SET PY_REPLACE "
                "ON MATCH SET PY_REPLACE "

                "WITH trd AS trd "
                "CALL apoc.create.addLabels(trd, [\"Comment\"]) YIELD node "

                "WITH trd as trd "
                # Domain needs a leading colon to be a label rather than a variable name.
                "MERGE (trd)-[r:THREAD_IN]->(:Domain {domain_id:trd.domain_id}) "
                "ON CREATE SET r.created_utc = trd.created_utc "
                "ON MATCH SET r.created_utc = trd.created_utc "

                "RETURN distinct 'done' ")
    crt_params = {"threads": threads}

    # Insert the individual properties we recorded earlier.
    crt_stmt = crt_stmt.replace("PY_REPLACE", cypher_property_string)
    
    run_res = session.run(crt_stmt, crt_params)

This works fine: the nodes get created with the properties taken from threads, a list of dicts that is passed to UNWIND through crt_params.

However, the Result instance in run_res does not have any ResultSummary inside it with a SummaryCounters instance for me to access statistics of created nodes and relations. I suspect this is because of:

"RETURN distinct 'done' "

However, I am not sure if this is the reason.

Hoping someone may be able to help me set up my queries so that, no matter the number of MERGE operations I perform, I get the statistics for the whole query that was sent in crt_stmt.

Thank you very much.


Solution

  • In earlier versions of the neo4j Python driver you could write n = result.summary().counters.nodes_created, but as of 4.0 the summary() method no longer exists.

    According to https://neo4j.com/docs/api/python-driver/current/breaking_changes.html, Result.summary() has been replaced with Result.consume(), which consumes all remaining records in the buffer and returns the ResultSummary.

    You can get all the counters with counters = run_res.consume().counters, and then read, for example, counters.nodes_created and counters.relationships_created.
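
    To make this reusable across many MERGE queries, something like the following might work. It is only a sketch: run_with_counters and COUNTER_FIELDS are hypothetical names of my own, not part of the driver, and I am assuming you only care about the counters listed. The counters cover the whole statement, so it should not matter how many MERGE clauses the query contains.

```python
# Attribute names read from the driver's SummaryCounters object.
COUNTER_FIELDS = (
    "nodes_created",
    "nodes_deleted",
    "relationships_created",
    "relationships_deleted",
    "properties_set",
    "labels_added",
)


def run_with_counters(session, query, parameters=None):
    """Run a query and return its update counters as a plain dict.

    Hypothetical helper, not part of the neo4j driver.
    """
    result = session.run(query, parameters or {})
    # consume() discards any records not yet read and returns the
    # ResultSummary, so iterate over the result first if you need the rows.
    summary = result.consume()
    return {name: getattr(summary.counters, name) for name in COUNTER_FIELDS}


# Usage with the question's variables (needs a live driver and server):
#
# with driver.session() as session:
#     stats = run_with_counters(session, crt_stmt, crt_params)
#     print(stats["nodes_created"], stats["relationships_created"])
```

    Returning a plain dict makes it easy to add up the numbers over several queries or to log them, and the helper is called while the session is still open.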