
Query improvement - gremlin #2


I need help with Gremlin query performance. I am trying to retrieve all notifications (vertices) whose type is 1. The indexed property is receiver, which has a mixed index.

For the query below, the round-trip time is approximately 400 ms:

gremlin> g.V().has('receiver','3145912').has('type','1').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[receiver.eq(3145912), type.e...                    94          94         504.393   100.00
    \_condition=(receiver = 3145912 AND type = 1)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=[(type = 1)]:byTypeMixed
    \_index=byTypeMixed
    \_index_impl=search
  optimization                                                                                 0.009
  optimization                                                                                 0.189
  backend-query                                                    16538                     494.729
    \_query=byTypeMixed:[(type = 1)]:byTypeMixed
  backend-query                                                       95                       9.352
    \_query=byReceiverMixed:[(receiver = 3145912)]:byReceiverMixed
                                            >TOTAL                     -           -         504.393        -

If I run the same query without the second condition, the round-trip time is about 3 ms:

gremlin> g.V().has('receiver','3145912').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[receiver.eq(3145912)])                             95          95           3.735   100.00
    \_condition=(receiver = 3145912)
    \_orders=[]
    \_isFitted=true
    \_isOrdered=true
    \_query=[(receiver = 3145912)](2000):byReceiverMixed
    \_index=byReceiverMixed
    \_index_impl=search
  optimization                                                                                 0.012
  optimization                                                                                 0.283
  backend-query                                                       95                       3.367
    \_query=byReceiverMixed:[(receiver = 3145912)](2000):byReceiverMixed
    \_limit=2000
                                            >TOTAL                     -           -           3.735        -

Is there a way to improve the first query?

Thanks


Solution

  • The profile shows that the additional 'type' filter makes JanusGraph retrieve 16538 identifiers from the indexing backend, and that backend query accounts for almost all of the 504 ms. It may be faster to query only the 'receiver' index and then read the 'type' values of the matching vertices from the storage backend, using:

    g.V().has('receiver','3145912').where(values('type').is('1'))
    

    If filtering vertices on both the receiver and type properties is a frequently recurring query pattern, it is also possible to define a single index on both properties (when using an indexing backend with JanusGraph).
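
    A minimal sketch of defining such an index from the Gremlin console follows. It assumes that graph is the open JanusGraph instance, that both property keys already exist, and that the indexing backend is registered under the name search (the profile's _index_impl=search suggests this, but check your configuration). The index name byReceiverAndTypeMixed is made up for illustration.

    // Close any open transaction before changing the schema.
    graph.tx().rollback()
    mgmt = graph.openManagement()
    receiver = mgmt.getPropertyKey('receiver')
    type = mgmt.getPropertyKey('type')
    // One mixed index that covers both keys (index name is hypothetical).
    mgmt.buildIndex('byReceiverAndTypeMixed', Vertex.class).addKey(receiver).addKey(type).buildMixedIndex('search')
    mgmt.commit()
    // Wait until the new index is registered with all instances.
    ManagementSystem.awaitGraphIndexStatus(graph, 'byReceiverAndTypeMixed').call()
    // Reindex so that vertices written before the index existed are covered.
    mgmt = graph.openManagement()
    mgmt.updateIndex(mgmt.getGraphIndex('byReceiverAndTypeMixed'), SchemaAction.REINDEX).get()
    mgmt.commit()

    After reindexing, running the original two-condition query with .profile() again should show whether JanusGraph now answers it with a single backend query against the new index. On a large graph the reindex step can take a long time, so it is worth trying on a test instance first.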