I have a case where I need to find nodes that are not related to all of the required nodes.
My business logic is as follows:
* A Trajectory contains several Points.
* A Trajectory is complete when it has at least:
  * ONE Point START
  * ONE Point MIDDLE
  * ONE Point FINISH
In the following example, I have 4 Trajectories:
http://console.neo4j.org/?id=1fjeyl
One Trajectory is complete and the other three are incomplete.
How can I find all the trajectories that do not contain all the required points?
There are a few ways you can do this.
With your current model, one option is to collect the point positions per trajectory and use list predicates to keep only the trajectories where at least one required position is missing (this assumes each point stores its position in a pos property):
MATCH (t:Trajectory)
// OPTIONAL MATCH keeps trajectories that have no points at all
OPTIONAL MATCH (t)-[:CONTAINS]->(p)
WITH t, collect(DISTINCT p.pos) AS pointPositions
WHERE size(pointPositions) < 3 OR any(required IN ['START', 'MIDDLE', 'FINISH'] WHERE NOT required IN pointPositions)
RETURN t
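If you also want to see which positions each trajectory lacks, a variation along these lines might help. It is only a sketch: it assumes the pos property used above and the three position values 'START', 'MIDDLE' and 'FINISH' from the question, and it starts from an OPTIONAL MATCH so that trajectories without any points are reported too.

```cypher
// Assumption: points hang off trajectories via :CONTAINS and carry a `pos` property
MATCH (t:Trajectory)
OPTIONAL MATCH (t)-[:CONTAINS]->(p)
// collect() skips nulls, so a trajectory with no points yields an empty list
WITH t, collect(DISTINCT p.pos) AS pointPositions
// keep only the required positions that are absent for this trajectory
WITH t, [required IN ['START', 'MIDDLE', 'FINISH'] WHERE NOT required IN pointPositions] AS missing
WHERE size(missing) > 0
RETURN t, missing
```

A complete trajectory produces an empty missing list and is filtered out, so every returned row is an incomplete trajectory together with the positions it still needs.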
Note that you could refactor your model so that the position of a point is indicated by the relationship type instead, such as:
(:Trajectory)-[:HAS_START]->(:Point)
(:Trajectory)-[:HAS_END]->(:Point)
(:Trajectory)-[:HAS_MIDDLE]->(:Point)
Then your query becomes a bit simpler and more efficient (the gain is most noticeable when you have many trajectories, some of them with many connected nodes).
MATCH (t:Trajectory)
WHERE NOT (t)-[:HAS_START]->() OR NOT (t)-[:HAS_END]->() OR NOT (t)-[:HAS_MIDDLE]->()
RETURN t
With that modeling and this kind of query, we don't even have to expand out from trajectory nodes to get our answer, since a node knows what relationships are connected to it (by type and/or direction) and their count. It's very easy then to determine if certain relationship types are or aren't present.
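If you decide to try the refactored model on existing data, a one-off migration could look roughly like the sketch below. It assumes the current data uses :CONTAINS relationships and a pos property with the values 'START', 'MIDDLE' and 'FINISH'; adjust those names to whatever your graph actually uses. The statements are meant to be run one at a time (or in a client that accepts multiple semicolon-separated statements).

```cypher
// Hypothetical migration: derive typed relationships from the `pos` property.
// MERGE avoids creating duplicate relationships if the script is run twice.
MATCH (t:Trajectory)-[:CONTAINS]->(p)
WHERE p.pos = 'START'
MERGE (t)-[:HAS_START]->(p);

MATCH (t:Trajectory)-[:CONTAINS]->(p)
WHERE p.pos = 'MIDDLE'
MERGE (t)-[:HAS_MIDDLE]->(p);

MATCH (t:Trajectory)-[:CONTAINS]->(p)
WHERE p.pos = 'FINISH'
MERGE (t)-[:HAS_END]->(p);
```

The original :CONTAINS relationships are left in place here, so existing queries keep working while you compare the two approaches.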