I have written this Cypher query:
MATCH (SubjectUserNode:User:Transaction {NodeID: "2547:12109:000:381864"})
      -[dimensionRelation:LegalEntity WHERE dimensionRelation.Status = "1"]->
      (dimension:LegalEntity:Transaction)
      <-[udimensionRelation:LegalEntity WHERE udimensionRelation.Status = "1"]-
      (User:User:Transaction)
      -[RoleRelationship:Role WHERE RoleRelationship.Status = "1"]->
      (Role:Role:Transaction {NodeID: "2547:12122:000:70163"})
RETURN User.TransactionID AS UserID
With this query, I'm trying to get all User nodes that are related to a given Role node (identified by NodeID) and are also related to the LegalEntity node that the given User node (also identified by NodeID) is related to.
Example:
The query profile is:
The query takes 4 to 5 seconds to return its data, and it returns about 150 nodes.
Is there any way to improve this query, for example by using an APOC procedure? I can't think of another approach, since the query itself seems simple to me. Also, indexes already exist on the NodeID property of each node type.
Your query seems to be spending most of its time scanning through all the possible SubjectUserNode candidates.
So, try adding an index on User.NodeID. The execution planner should be able to use that index to reduce the execution time.
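As a rough sketch of what that could look like, assuming a Neo4j version recent enough to support the inline relationship WHERE used in the question: the index names user_node_id and role_node_id are made up for illustration, and IF NOT EXISTS makes the statements harmless if equivalent indexes are already in place.

```cypher
// Hypothetical index names; pick whatever fits your naming scheme.
// Range index on the NodeID property of nodes labeled User.
CREATE INDEX user_node_id IF NOT EXISTS
FOR (u:User) ON (u.NodeID);

// The other anchor of the pattern is the Role node, so it may benefit too.
CREATE INDEX role_node_id IF NOT EXISTS
FOR (r:Role) ON (r.NodeID);

// List the indexes that exist and check that they are ONLINE.
SHOW INDEXES;
```

After that, running the query again with PROFILE in front of it should show whether the plan starts from an index seek on NodeID rather than a label scan.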
Also, unless logically required, you should simplify your query to remove unnecessary filtering, which wastes time. For example, if all User nodes are also Transaction nodes, use (SubjectUserNode:User ...) instead of (SubjectUserNode:User:Transaction ...).
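For illustration only, here is how the query might look with the extra label dropped everywhere. This assumes that every User, LegalEntity, and Role node also carries the Transaction label, which only you can confirm for your data; if that does not hold, the results can differ from the original query.

```cypher
// Sketch under the assumption that :Transaction is redundant on these nodes.
// Prefix with PROFILE to compare the plan and db hits against the original.
MATCH (SubjectUserNode:User {NodeID: "2547:12109:000:381864"})
      -[dimensionRelation:LegalEntity WHERE dimensionRelation.Status = "1"]->
      (dimension:LegalEntity)
      <-[udimensionRelation:LegalEntity WHERE udimensionRelation.Status = "1"]-
      (User:User)
      -[RoleRelationship:Role WHERE RoleRelationship.Status = "1"]->
      (Role:Role {NodeID: "2547:12122:000:70163"})
RETURN User.TransactionID AS UserID
```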