I have created a graph in neo4j community edition running in a docker container in my local using below cypher.
create (a:Account{name:'account.a'}),
(b:Account{name:'account.b'}),
(c:Account{name:'account.c'}),
(d:Account{name:'account.d'}),
(e:Account{name:'account.e'}),
(f:Account{name:'account.f'}),
(x:Account{name:'account.x'}),
(y:Account{name:'account.y'}),
(cas:Calculation{name:'account.a:simple_interest'}),
(cac:Calculation{name:'account.a:compound_interest'}),
(cbs:Calculation{name:'account.b:simple_interest'}),
(cbc:Calculation{name:'account.b:compound_interest'}),
(cxs:Calculation{name:'account.x:simple_interest'}),
(cxc:Calculation{name:'account.x:compound_interest'}),
(a)-[:RESULTS_FROM]->(cas),
(a)-[:RESULTS_FROM]->(cac),
(b)-[:RESULTS_FROM]->(cbs),
(b)-[:RESULTS_FROM]->(cbc),
(x)-[:RESULTS_FROM]->(cxs),
(x)-[:RESULTS_FROM]->(cxc),
(cas)-[:DEPENDS_ON]->(d),
(cas)-[:DEPENDS_ON]->(e),
(cbs)-[:DEPENDS_ON]->(c),
(cbc)-[:DEPENDS_ON]->(y),
(cac)-[:DEPENDS_ON]->(f),
(cac)-[:DEPENDS_ON]->(d),
(cxs)-[:DEPENDS_ON]->(d),
(cxc)-[:DEPENDS_ON]->(d)
There are two kinds of nodes in the graph - Account and Calculation. Account node has a "RESULTS_FROM" relation to Calculation node with the same name + interest type. A Calculation node has a "DEPENDS_ON" relation with 'n' number of other Account nodes.
Below is how the graph looks like.
I need to write a cypher which starts at an Account node and finds all it's dependencies in the entire graph. A dependency in this context is the Calculation node(s) it "RESULTS_FROM" and the Account nodes that those Calculation node(s) "DEPENDS_ON". It should do this recursively in all direction until it finds an Account node which doesn't have any "RESULTS_FROM" relation to any other Calculation nodes.
In the snapshot, if I start at "account.b", it will follow both "RESULTS_FROM" relation (via blue Calculation nodes) which then point to "account.y" and "account.c" via "DEPENDS_ON".
So the output should be - account.y and account.c
Similarly, the output for "account.x" should be "account.d". We stop when any Account doesn't "RESULT_FROM" any Calculation.
I tried different things like below but I am not sure what's the ideal way to do this.
MATCH (startNode:YourLabel {name: 'account.a'})
CALL {
WITH [startNode] AS nodes, [] AS rels
UNWIND nodes AS node
FOREACH(ignoreMe IN [1] |
MATCH (node)-[r]->(relatedNode)
WHERE NOT relatedNode IN nodes
SET nodes = nodes + relatedNode
SET rels = rels + r
)
}
RETURN DISTINCT nodes AS reachableNodes
ps- the data here is just a sample and in reality, the graph will have 100s of Account and Calculation nodes. The Relations between the nodes remain the same though. And no Calculation node will depend on a Account node with the same name + interest_type
One option is using APOC
. One of the advantages is that it easily gives you only the "end" nodes that you want:
MATCH(n:Account{name:'account.b'})
CALL apoc.path.subgraphAll(n, {
relationshipFilter:"RESULTS_FROM>|DEPENDS_ON>",
labelFilter: ">Account"
})
YIELD nodes
RETURN nodes
returns:
╒════════════════════════════════════════════════════════════════╕
│nodes │
╞════════════════════════════════════════════════════════════════╡
│[(:Account {name: "account.y"}), (:Account {name: "account.c"})]│
└────────────────────────────────────────────────────────────────┘
EDIT:
relationshipFilter
was changed from >
to RESULTS_FROM>|DEPENDS_ON>
according to @cybersam's remark