I have the following neptune query
g.V().hasLabel('User')
.has('user_id', 1004)
.repeat(both('USES_UPI','USES_ACCOUNT','USES_HARDWARE_ID','USES_GAID','HAS_COOKIES').simplePath().dedup())
.times(3)
.hasLabel('Gaid')
.dedup()
.count()
and it's taking too much time.
I tried profiling the query
Original Traversal
==================
[GraphStep(vertex,[]), HasStep([~label.eq(User), user_id.eq(159017810)]), RepeatStep([VertexStep(BOTH,[USES_UPI, USES_ACCOUNT, USES_HARDWARE_ID, USES_GAID, HAS_COOKIES],vertex), PathFilterStep(simple,null,null), DedupGlobalStep(null,null), RepeatEndStep],until(loops(3)),emit(false)), HasStep([~label.eq(Gaid)]), DedupGlobalStep(null,null), CountGlobalStep]
Optimized Traversal
===================
Neptune steps:
[
NeptuneGraphQueryStep(Vertex) {
JoinGroupNode {
PatternNode[(?1, <user_id>, ?9, ?) . project distinct ?1 . ContainsFilter(?9 in (159017810^^<INT>, 159017810^^<LONG>, 1.59017808E8^^<FLOAT>, 1.5901781E8^^<DOUBLE>)) .], {estimatedCardinality=1, expectedTotalOutput=1, indexTime=0, joinTime=0, numSearches=1, actualTotalOutput=1}
PatternNode[(?1, <~label>, ?2=<User>, <~>) . project ask .], {estimatedCardinality=1327714, expectedTotalOutput=679, actualTotalOutput=379683, indexTime=0, joinTime=0, numSearches=1}
RepeatNode {
Repeat {
JoinGroupNode {
UnionNode {
PatternNode[(?3, ?6, ?4, ?7) . project ?3,?4 . IsEdgeIdFilter(?7) . ContainsFilter(?6 in (<USES_UPI>, <USES_ACCOUNT>, <USES_HARDWARE_ID>, <USES_GAID>, <HAS_COOKIES>)) .], {cacheJoin=true, estimatedCardinality=1923966, indexTime=354, joinTime=26212, numSearches=379683}
PatternNode[(?4, ?6, ?3, ?7) . project ?3,?4 . IsEdgeIdFilter(?7) . ContainsFilter(?6 in (<USES_UPI>, <USES_ACCOUNT>, <USES_HARDWARE_ID>, <USES_GAID>, <HAS_COOKIES>)) .], {cacheJoin=true, estimatedCardinality=1923966, indexTime=356, joinTime=19633, numSearches=379683}
}, annotations={estimatedCardinality=3847932}
SimplePathFilter(?1, ?4)) .
}
}
LoopsCondition {
LoopsFilter(?3,eq(3))
}
}, annotations={emitFirst=false, untilFirst=false, repeatMode=BFS, dedup=true}
}, annotations={path=[Vertex(?1):GraphStep, Repeat[̶V̶e̶r̶t̶e̶x̶(̶?̶3̶)̶:̶G̶r̶a̶p̶h̶S̶t̶e̶p̶, Vertex(?4):VertexStep, ̶V̶e̶r̶t̶e̶x̶(̶?̶8̶)̶:̶V̶e̶r̶t̶e̶x̶S̶t̶e̶p̶]], joinStats=true, optimizationTime=2, maxVarId=10, executionTime=107826}
},
NeptuneTraverserConverterStep
]
+ not converted into Neptune steps: NeptuneHasStep([~label.eq(Gaid)]),
Neptune steps:
[
NeptuneMemoryTrackerStep
]
+ not converted into Neptune steps: DedupGlobalStep(null,null),CountGlobalStep,
WARNING: >> [NeptuneHasStep([~label.eq(Gaid)]), DedupGlobalStep(null,null)] << (or one of the children for each step) is not supported natively yet
Runtime (ms)
============
Query Execution: 107828.341
Traversal Metrics
=================
Step Count Traversers Time (ms) % Dur
-------------------------------------------------------------------------------------------------------------
NeptuneGraphQueryStep(Vertex) 743022 743022 52892.767 49.06
NeptuneTraverserConverterStep 743022 743022 8142.297 7.55
NeptuneHasStep([~label.eq(Gaid)]) 4138 4138 46695.267 43.32
DedupGlobalStep(null,null) 4138 4138 48.927 0.05
CountGlobalStep 1 1 22.215 0.02
>TOTAL - - 107801.475 -
Repeat Metrics
==============
Iteration Visited Output Until Emit Next
------------------------------------------------------
0 1 0 0 0 1
1 3 0 0 0 3
2 379679 0 0 0 379679
3 743022 743022 743022 0 0
------------------------------------------------------
1122705 743022 743022 0 379683
Predicates
==========
# of predicates: 26
Results
=======
Count: 1
Index Operations
================
Query execution:
# of statement index ops: 1,502,390
# of unique statement index ops: 1,502,390
Duplication ratio: 1.0
# of terms materialized: 162
Right now it's taking time in seconds, can it be optimized somehow?
Turns out that we are having super nodes that got accidentally inserted in the DB.
What I did was
1: Filtering nodes from a configurable list and not insert if their ids matched.
2: In the query added additional clause to filter out super node ids.
3: As we are getting concurrent requests we made our spring API async via CompletableFuture and gave it a dedicated thread-pool.
4: Instead of using Apache Tinkerpop driver, we opted for Amazon Neptune Rest API - https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-rest.html
Doing this it reduced the time from 1 minute to 80ms.