Search code examples
neo4jcypherneo4jrestclient

Neo4J: How can I find if a path traversing multiple nodes given in a list exist?


I have a graph of nodes with a relationship NEXT with 2 properties sequence (s) and position (p). For example:

N1-[NEXT{s:1, p:2}]-> N2-[NEXT{s:1, p:3}]-> N3-[NEXT{s:1, p:4}]-> N4 

A node N might have multiple outgoing Next relationships with different property values.

Given a list of node names, e.g. [N2,N3,N4] representing a sequential path, I want to check if the graph contains the nodes and that the nodes are connected with relationship Next in order.

For example, if the list contains [N2,N3,N4], then check if there is a relationship Next between nodes N2,N3 and between N3,N4.

In addition, I want to make sure that the nodes are part of the same sequence, thus the property s is the same for each relationship Next. To ensure that the order maintained, I need to verify if the property p is incremental. Meaning, the value of p in the relationship between N2 -> N3 is 3 and the value p between N3->N4 is (3+1) = 4 and so on.

I tried using APOC to retrieve the possible paths from an initial node N using python (library: neo4jrestclient) and then process the paths manually to check if a sequence exists using the following query:

q = "MATCH (n:Node) WHERE n.name = 'N' CALL apoc.path.expandConfig(n {relationshipFilter:'NEXT>', maxLevel:4}) YIELD path RETURN path"

results = db.query(q,data_contents=True)

However, running the query took some time that I eventually stopped the query. Any ideas?


Solution

  • This one is a bit tough.

    First, pre-match to the nodes in the path. We can use the collected nodes here to be a whitelist for nodes in the path

    Assuming the start node is included in the list, a query might go like:

    UNWIND $names as name
    MATCH (n:Node {name:name})
    WITH collect(n) as nodes
    WITH nodes, nodes[0] as start, tail(nodes) as tail, size(nodes)-1 as depth
    CALL apoc.path.expandConfig(start, {whitelistNodes:nodes, minLevel:depth, maxLevel:depth, relationshipFilter:'NEXT>'}) YIELD path
    WHERE all(index in range(0, size(nodes)-1) WHERE nodes[index] = nodes(path)[index])
    // we now have only paths with the given nodes in order
    WITH path, relationships(path)[0].s as sequence
    WHERE all(rel in tail(relationships(path)) WHERE rel.s = sequence)
    // now each path only has relationships of common sequence
    WITH path, apoc.coll.pairsMin([rel in relationships(path) | rel.p]) as pairs
    WHERE all(pair in pairs WHERE pair[0] + 1 = pair[1])
    RETURN path