Tags: python, neo4j, cypher, neo4j-apoc

A query to find or create a relationship and increment its count property by one


I want to count "the number of times a node is connected to another node".

It seemed a bad idea to have multiple redundant relationships between the same two nodes, so I thought I should use a single relationship with a "count" property for this purpose.

But how do I update those relationships and increment the "count" property?

Neo4j doesn't have any auto-increment function for properties (according to what I read).

As an alternative, I wanted to MATCH the existing relationship (if it exists), and then update its count property, in Python, but nothing seems to work.

Here is what I have so far:

def create_or_increment_relationship(tx, start_node_label, start_node_property, new_node_label, new_node_property, relationship_type):

    return tx.run(

        "MATCH (s) WHERE (s.label = $start_node_label AND s.name = $start_node_property) \n \
        CALL apoc.merge.node($new_node_label, {name: $new_node_property}) \n \
        MATCH (s)-[r:$relationship_type]->(n) \n \
        CALL apoc.merge.relationship(s, $relationship_type, r.count+1, n);",

        start_node_label=start_node_label,
        start_node_property=start_node_property,
        new_node_label=new_node_label,
        new_node_property=new_node_property,
        relationship_type=relationship_type,

    ).single().value()

The query above has four lines:

  • find the start_node (s, which should already exist)
  • find or create the new_node (n, which sometimes doesn't exist)
  • find the relationship between them (r, which doesn't exist if n doesn't exist)
  • if it exists, increment its "count" property by one, else set it to one (r.count)

I understand that I would also need to use APOC to MATCH an existing relationship based on a relationship type determined at runtime (a dynamic type, i.e. a Python variable). (That's the third line in the query part of this code.)

However I wasn't able to find any example of such a query, or an apoc function that matches relationships on criteria.


UPDATE: so apparently a way to MATCH relationships in APOC is to use apoc.merge.relationship as described in this post https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/2674:

In the second example, I'm using apoc.merge.node to MATCH the start and end nodes without actually creating or changing them. My assumption is though, that Neo4j still write-locks these nodes which affects the performance and would hinder optimisations such as parallelisation via apoc.periodic.iterate due to deadlocks on the same node.

... To the maintainers of APOC: I do agree it would be nice if there was an apoc.match.relationship (or node) function. It would be nice, I think, for ease of use, understanding, and clarity of the API; and this poster also says that a dedicated implementation could bring performance benefits in some cases.


Solution

  • Things to consider:

    1. The statement below does NOT work in a parameterized query: Cypher does not accept $relationship_type directly in the relationship type position. As a result, your query returns no data past this statement, and the last line creates no relationship.

    MATCH (s)-[r:$relationship_type]->(n)

    Fix: Substitute the relationship type into the query string in Python before running the query.

    2. apoc.merge.relationship will not create a relationship if the preceding MATCH does NOT return a row. The query simply ends without running the last APOC statement.

    Fix: Use MERGE with ON CREATE and ON MATCH clauses instead.

    3. There are also errors in your query, so I suggest debugging it by copying the Cypher statement you actually generated and running it in the Neo4j console.

    Fix: Look at lines 1 and 2 of my query.
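Because the first fix splices a Python string into the query text, it is worth checking that string before using it. The following is only a minimal sketch of that idea: the helper name `build_increment_query` and the rule that a relationship type must be a plain identifier are assumptions for illustration, not part of Neo4j or APOC.

```python
import re

# Assumption: relationship types are limited to plain identifiers
# (letters, digits, underscores; not starting with a digit).
_REL_TYPE_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_increment_query(relationship_type: str) -> str:
    """Return Cypher text that merges the relationship and bumps its count.

    The relationship type is spliced into the text; every other value
    stays a $parameter to be supplied when the query is run.
    """
    if not _REL_TYPE_PATTERN.fullmatch(relationship_type):
        raise ValueError(f"unsafe relationship type: {relationship_type!r}")
    return (
        "MATCH (s) WHERE $start_node_label IN labels(s) "
        "AND s.name = $start_node_property\n"
        "CALL apoc.merge.node([$new_node_label], "
        "{name: $new_node_property}) YIELD node AS n\n"
        # Only this line is an f-string, so the braces above stay literal.
        f"MERGE (s)-[r:{relationship_type}]->(n)\n"
        "  ON CREATE SET r.count = 1\n"
        "  ON MATCH SET r.count = r.count + 1\n"
        "RETURN r"
    )
```

The returned string can then be passed to `session.run(...)` together with the four remaining parameters. Rejecting anything that is not a plain identifier keeps a stray `]` or space from changing the shape of the query.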

    Below is a working snippet of code that you can use.

    from neo4j import GraphDatabase

    uri = "bolt://localhost:7687"
    user = "neo4j"
    password = "<awesome_psw>"

    start_node_label = 'Person'
    start_node_property = 'user'
    new_node_label = 'NewPerson'
    new_node_property = 'newuser'
    relationship_type = 'RELATED_TO'

    # {relationship_type} is filled in by str.format() below; the doubled
    # braces around the property map come out as literal { } in the query.
    cypher = "MATCH (s) WHERE $start_node_label in labels(s) AND s.name = $start_node_property \n \
              CALL apoc.merge.node([$new_node_label], {{name: $new_node_property}}) YIELD node as n \n \
              MERGE (s)-[r:{relationship_type}]->(n) \n \
                 ON CREATE set r.count = 1  \n \
                 ON MATCH set r.count = r.count + 1 \n \
              RETURN r;"
    query = cypher.format(relationship_type=relationship_type)

    driver = GraphDatabase.driver(uri, auth=(user, password))
    with driver.session() as session:
        # The relationship type is already part of the query text, so only
        # the remaining four values are passed as parameters.
        result = session.run(query,
                start_node_label=start_node_label,
                start_node_property=start_node_property,
                new_node_label=new_node_label,
                new_node_property=new_node_property)
        # single() returns None when the start node was not found.
        d = result.single().value()
        print(d)
    driver.close()
    

    Sample result:

    <Relationship id=7 nodes=(<Node id=8 labels=frozenset() properties={}>, <Node id=7 labels=frozenset() properties={}>) type='HAS' properties={'count': 1}>
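If you would rather keep the relationship type as a real query parameter, one possible alternative is to let apoc.merge.relationship do the find-or-create step and increment the counter afterwards. This is a sketch under assumptions, not a tested drop-in: it assumes the APOC signature `apoc.merge.relationship(startNode, relType, identProps, onCreateProps, endNode, onMatchProps)` yielding `rel`, and the names `APOC_INCREMENT_QUERY` and `increment_relationship` are made up for illustration.

```python
# Assumption: APOC is installed and apoc.merge.relationship yields `rel`.
APOC_INCREMENT_QUERY = """
MATCH (s) WHERE $start_node_label IN labels(s) AND s.name = $start_node_property
CALL apoc.merge.node([$new_node_label], {name: $new_node_property}) YIELD node AS n
CALL apoc.merge.relationship(s, $relationship_type, {}, {}, n, {}) YIELD rel
SET rel.count = coalesce(rel.count, 0) + 1
RETURN rel.count AS count
"""


def increment_relationship(tx, start_node_label, start_node_property,
                           new_node_label, new_node_property,
                           relationship_type):
    """Find or create the relationship, add one to its count, return the count.

    `tx` is anything with a run(query, **params) method, such as a neo4j
    transaction. Returns None when the start node does not exist.
    """
    result = tx.run(
        APOC_INCREMENT_QUERY,
        start_node_label=start_node_label,
        start_node_property=start_node_property,
        new_node_label=new_node_label,
        new_node_property=new_node_property,
        relationship_type=relationship_type,
    )
    record = result.single()
    return None if record is None else record["count"]
```

With version 5 of the Python driver this could be called as `session.execute_write(increment_relationship, 'Person', 'user', 'NewPerson', 'newuser', 'RELATED_TO')`. The `coalesce` makes a freshly created relationship start at 1, so no string formatting of the query is needed.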