I'm working with a Neo4j database with a data model with GAP and SCENARIO nodes. Each GAP node has a "next" property that lists the keys of connected SCENARIO nodes. The SCENARIO nodes have a "parents" property indicating how many parent nodes they have. I want to filter SCENARIO nodes based on a list of selected GAP nodes and the number of parents.
Here is how GAP nodes are supposed to SCENARIO nodes:
Here is my GAP and SCENARIO nodes model:
(gap1:GAP {name: 'g1',
key: 'g1',
desc: '',
next: 's1,s2'
}),
//GAP2
(gap2:GAP {name: 'g2',
key: 'g2',
desc: '',
next: 's1,s2,s3'
}),
//GAP3
(gap3:GAP {name: 'g3',
key: 'g3',
desc: 'g3',
next: 's2'
}),
//SCENARIO
(scenario1:SCENARIO {name: 's1',
key: 's1',
DESC: '',
parents: 2
}),
(scenario2:SCENARIO {name: 's2',
key: 's2',
DESC: '',
parents: 3
}),
(scenario3:SCENARIO {name: 's3',
key: 's3',
DESC: '',
parents: 1}),
If a gap is provided as input in the query and it is a parent of a scenario, but the number of parents of that scenario is more than 1 in the parent node, it should exclude the scenario from the result. If two gaps are provided, and both are parents of a single scenario, other scenario nodes are included. Still, the parent condition only meets for that single node; it should only return that single node.
I have tried too many combinations but always fail somewhere. This query somehow worked, but it failed for the following conditions:
I have been stuck on this for around three days but have been unable to figure out a solution. Please note that, per the scenario, a direct link from GAP to SCENARIO will not solve the problem.
MATCH (gap:GAP)
WHERE gap.name IN ['g1', 'g2', 'g3']
WITH COLLECT(DISTINCT gap) AS selectedGaps
WHERE size(selectedGaps) > 0
MATCH (scenario:SCENARIO)
WHERE ANY(gap IN selectedGaps WHERE scenario.key IN split(gap.next, ','))
RETURN scenario.name AS ScenarioName;
You have inconsistencies in your sample data.
scenario2
says it has 3 parents, but there are only 2 GAP
nodes containing the s2
key.scenario3
says it has 1 parent, but there are actually 2 GAP
nodes containing the s3
key.Here is a query that should work, assuming you fix your data to be consistent:
MATCH (gap:GAP)
WHERE gap.name IN $gaps
UNWIND SPLIT(gap.next, ',') AS sKey
WITH sKey, COUNT(*) AS cnt
WITH COLLECT({sKey: sKey, cnt: cnt}) AS data
MATCH (s:SCENARIO)
WHERE ANY(d IN data WHERE d.sKey = s.key AND d.cnt = s.parents)
RETURN s.name AS ScenarioName;
The query assumes that the list of GAP
names is provided as a $gaps
parameter.
You should strongly consider changing your data model, however. You are not using any relationships, which is the main strength of a graph database. GAP
and SCENARIO
nodes should not store relationship information as properties, but should be linked by actual relationships. I would also recommend conforming to the recommended neo4j naming convention, where labels are camel-cased.
For example, this could conceptually be your new data model (without the next
and parents
properties):
(g:Gap {name, key, desc})-[:HAS_SCENARIO]->(s:Scenario {name, key, desc})
With this new model, the query for your use case would simply be:
MATCH (g:Gap)-[:HAS_SCENARIO]->(s:Scenario)
WHERE g.name IN $gaps
WITH s, COUNT(g) AS cnt
WHERE COUNT{ ()-[:HAS_SCENARIO]->(s) } = cnt
RETURN s.name AS scenarioName;