Tags: neo4j, cypher, neo4j-apoc

Neo4j Graph Cypher Query - Formula Relationship Traversal


This graph model represents BusinessRule nodes (formulas) and ChartField nodes (values). A BusinessRule node has "USES" relationships to the ChartField nodes that are parameters of its formula, and one "OUTPUTS" relationship to the ChartField node that is the target of its calculation.

I am attempting to return all ChartField and BusinessRule nodes that could either be impacted by a change in an initial ChartField node's value or would be required by a BusinessRule to perform its calculation. With this in mind, the traversal always starts from a ChartField node.

I have the following Cypher query:

MATCH (start:ChartField {Id: '2025-BUDGET-11058201'})
CALL apoc.path.expandConfig(start, {
    relationshipFilter: "USES|OUTPUTS>",
    labelFilter: ">BusinessRule|>ChartField",
    uniqueness: "NODE_GLOBAL"
}) YIELD path
RETURN path;

There are two node types:

  1. ChartField
  2. BusinessRule

The query will always start by matching a ChartField node. In this case it matches the "Tours Taken" ChartField node.

The Problem

I need the Cypher query to return any BusinessRule node connected to the initial ChartField node that is matched, and any ChartField node that has either an "OUTPUTS" or a "USES" relationship with those BusinessRule nodes.

I also need to keep traversing the relationships: if a ChartField node is related to the initial BusinessRule node via an "OUTPUTS" relationship, and is also related to a different BusinessRule node via a "USES" relationship, the query should continue down that dependency and find the next level of BusinessRule and ChartField nodes.

This query mostly accomplishes this. What I want to avoid: I don't need additional BusinessRule nodes, or their connected ChartField nodes, unless the ChartField node leading to them was reached via an "OUTPUTS" relationship.

In the graph below, the nodes I have crossed out in red shouldn't be returned. The "Contract %" node is necessary because I need it for the calculation, but because its relationship to its BusinessRule node isn't "OUTPUTS", I don't need to find any other BusinessRule nodes that the "Contract %" ChartField node may be tied to.

I know this is a bit long-winded, but I want to start with the "Tours Taken" ChartField node and return everything seen in the image below, except what I have crossed out.
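To make the rules above concrete, here is a minimal sample graph that could be used to try out candidate queries. It is only a sketch: the labels, relationship types, the Id property, the start Id and the names "Tours Taken" and "Contract %" come from the question, while the Name property, every other Id and the remaining node names are invented for illustration.

```cypher
// Hypothetical sample data: all Ids except the start Id, the Name property,
// and the names 'Output 1', 'Output 2' and 'Unrelated Output' are assumptions.
CREATE (tours:ChartField    {Id: '2025-BUDGET-11058201', Name: 'Tours Taken'}),
       (contract:ChartField {Id: 'CF-CONTRACT-PCT',      Name: 'Contract %'}),
       (out1:ChartField     {Id: 'CF-OUT-1',             Name: 'Output 1'}),
       (out2:ChartField     {Id: 'CF-OUT-2',             Name: 'Output 2'}),
       (other:ChartField    {Id: 'CF-OTHER',             Name: 'Unrelated Output'}),
       (br1:BusinessRule {Id: 'BR-1'}),
       (br2:BusinessRule {Id: 'BR-2'}),
       (br3:BusinessRule {Id: 'BR-3'}),
       // BR-1 uses the start node plus 'Contract %' and outputs 'Output 1'
       (br1)-[:USES]->(tours),
       (br1)-[:USES]->(contract),
       (br1)-[:OUTPUTS]->(out1),
       // BR-2 depends on BR-1's output, so it belongs in the result
       (br2)-[:USES]->(out1),
       (br2)-[:OUTPUTS]->(out2),
       // BR-3 is only reachable through the input 'Contract %', so it should be excluded
       (br3)-[:USES]->(contract),
       (br3)-[:OUTPUTS]->(other);
```

Under the rules described above, a query starting at "Tours Taken" should return BR-1, BR-2, "Tours Taken", "Contract %", "Output 1" and "Output 2", and should leave out BR-3 and "Unrelated Output", which correspond to the crossed-out nodes.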

[Image: output from the initial Cypher query]


Solution

  • Although your APOC example returns paths, you state that you just want the nodes, so I will go with that in the following.

    To match your rules, it is easiest to gather the BusinessRule nodes with a quantified path pattern in one MATCH clause, and then gather the ChartField nodes linked to those BusinessRule nodes in a second MATCH:

    MATCH (start:ChartField {Id: '2025-BUDGET-11058201'})<-[:USES]-(:BusinessRule)
          ( ()-[:OUTPUTS]->(:ChartField)<-[:USES]-(:BusinessRule) )* 
          (br)
    WITH DISTINCT br
    WITH [br] + 
        COLLECT {
            WITH br
            MATCH (br)-[:OUTPUTS|USES]-(cf:ChartField)
            RETURN cf 
        } AS nodes
    UNWIND nodes as node
    RETURN DISTINCT node
    

    The COLLECT subquery builds one list per BusinessRule node, containing that node and its linked ChartField nodes, so that the UNWIND and RETURN DISTINCT can return each node once, in a separate row.
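
    If a COLLECT subquery is not available in your Neo4j version, or you prefer a plain second MATCH, roughly the same result could be sketched with the collect() aggregating function. This is a variation on the query above rather than a tested drop-in replacement; it assumes, as the original does, that quantified path patterns are supported and that no node carries both labels.

```cypher
// Step 1: walk USES -> OUTPUTS -> USES chains to find every relevant BusinessRule
MATCH (start:ChartField {Id: '2025-BUDGET-11058201'})<-[:USES]-(:BusinessRule)
      ( ()-[:OUTPUTS]->(:ChartField)<-[:USES]-(:BusinessRule) )*
      (br)
WITH DISTINCT br
// Step 2: pick up the inputs and the output of each of those BusinessRule nodes
MATCH (br)-[:OUTPUTS|USES]-(cf:ChartField)
// Step 3: merge both node sets into one list and emit one row per node
WITH collect(DISTINCT br) + collect(DISTINCT cf) AS nodes
UNWIND nodes AS node
RETURN DISTINCT node
```

    Because the start ChartField node is itself linked by USES to the first BusinessRule node, it is picked up in step 2 and does not need to be added separately.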