graph, gremlin, tinkerpop3, janusgraph

Query performance issue: JanusGraph hangs while running a path query


We have the graph data below: only 5 vertices, but many edges. How can I handle the queries below? I would like to get the paths from one node to another, or just the cycle paths.

suresh       = graph.addVertex(label,'person','uuid','7bff1bc0-cef1-1033-8f28-d99da6cfd8a9')
robin_niu    = graph.addVertex(label,'person','uuid','e3348740-d37f-1031-8b5b-89fbb6fdad64')
hujunjie     = graph.addVertex(label,'person','uuid','5e5139c0-dbe7-102e-8780-bedba724cbf7')
clintpollitt = graph.addVertex(label,'person','uuid','d92c6340-f98b-1035-85d7-bee5d5cc5ebe')
yanjuqi      = graph.addVertex(label,'person','uuid','2d9fba40-74c7-1033-8e84-d3a6c90ad2e9')

suresh.addEdge('Communication',robin_niu,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',robin_niu,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',robin_niu,'date','2017-11-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',robin_niu,'date','2017-11-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',robin_niu,'date','2017-11-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',hujunjie,'date','2017-12-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',hujunjie,'date','2017-12-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',hujunjie,'date','2017-12-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',hujunjie,'date','2017-12-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',hujunjie,'date','2017-12-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-10-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-10-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-10-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-10-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-10-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-01-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-01-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-01-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-01-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-01-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-11-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-11-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-11-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-03T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-04T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-05T00:00:00Z','weight',1)
suresh.addEdge('Communication',yanjuqi,'date','2017-12-05T00:00:00Z','weight',1)
suresh.addEdge('ProfilesReportingToChain',clintpollitt,'date','2016-12-01T00:00:00Z','weight',5)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-01T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-02T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
suresh.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)

clintpollitt.addEdge('Communication',robin_niu,'date','2017-12-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',robin_niu,'date','2017-12-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',robin_niu,'date','2017-12-03T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',hujunjie,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',hujunjie,'date','2017-12-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',hujunjie,'date','2017-10-03T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',hujunjie,'date','2017-12-04T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',hujunjie,'date','2017-12-05T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-10-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-10-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-10-03T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-10-04T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-10-05T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-01-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-01-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-01-03T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-01-04T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-01-05T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-11-03T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-11-04T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',yanjuqi,'date','2017-11-05T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-02T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-11T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-11T00:00:00Z','weight',1)
clintpollitt.addEdge('Communication',suresh,'date','2017-11-11T00:00:00Z','weight',1)

yanjuqi.addEdge('Communication',robin_niu,'date','2017-12-11T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',robin_niu,'date','2017-12-12T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',robin_niu,'date','2017-12-13T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',hujunjie,'date','2017-11-11T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',hujunjie,'date','2017-12-12T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',hujunjie,'date','2017-10-13T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',hujunjie,'date','2017-12-14T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',hujunjie,'date','2017-12-15T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',clintpollitt,'date','2017-10-01T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',clintpollitt,'date','2017-10-02T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',clintpollitt,'date','2017-10-03T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',clintpollitt,'date','2017-10-04T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',suresh,'date','2017-10-05T00:00:00Z','weight',1)
yanjuqi.addEdge('Communication',suresh,'date','2017-01-01T00:00:00Z','weight',1)

robin_niu.addEdge('Communication',hujunjie,'date','2017-12-11T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',hujunjie,'date','2017-12-12T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',hujunjie,'date','2017-12-13T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',hujunjie,'date','2017-11-11T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',hujunjie,'date','2017-12-12T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',hujunjie,'date','2017-10-13T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',yanjuqi,'date','2017-12-14T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',yanjuqi,'date','2017-12-15T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',clintpollitt,'date','2017-10-01T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
robin_niu.addEdge('Communication',suresh,'date','2017-01-01T00:00:00Z','weight',1)
robin_niu.addEdge('ProfilesReportingToChain',yanjuqi,'date','2017-10-02T00:00:00Z','weight',5)
robin_niu.addEdge('ProfilesColleague',hujunjie,'date','2017-10-03T00:00:00Z','weight',2)

hujunjie.addEdge('Communication',robin_niu,'date','2017-12-11T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',robin_niu,'date','2017-12-12T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',clintpollitt,'date','2017-12-13T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',clintpollitt,'date','2017-11-11T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',clintpollitt,'date','2017-12-12T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',suresh,'date','2017-10-13T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',suresh,'date','2017-11-01T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',suresh,'date','2017-01-01T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',yanjuqi,'date','2017-12-14T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',yanjuqi,'date','2017-12-15T00:00:00Z','weight',1)
hujunjie.addEdge('Communication',yanjuqi,'date','2017-10-01T00:00:00Z','weight',1)
hujunjie.addEdge('ProfilesReportingToChain',yanjuqi,'date','2016-10-02T00:00:00Z','weight',5)
hujunjie.addEdge('ProfilesColleague',robin_niu,'date','2017-10-03T00:00:00Z','weight',2)

I would like to get the paths from one node to another.

The two queries below don't work well. How can I fix this?

g.V().has('uuid','e3348740-d37f-1031-8b5b-89fbb6fdad64').repeat(out()).until(has('uuid','2d9fba40-74c7-1033-8e84-d3a6c90ad2e9')).simplePath().path()


g.V().has('uuid','e3348740-d37f-1031-8b5b-89fbb6fdad64').as('v').
  repeat(outE().as('e').inV().as('v')).
    until(has('uuid','2d9fba40-74c7-1033-8e84-d3a6c90ad2e9')).
  store('a').
    by('uuid').
  store('a').
    by(select(all, 'v').unfold().values('uuid').fold()).
  store('a').
    by(select(all, 'e').unfold().
       store('x').
         by(union(values('weight'),
                  select('x').count(local)).fold()).
       cap('x').
       store('a').
         by(unfold().limit(local, 1).fold()).unfold().
       sack(assign).
         by(constant(1d)).
       sack(div).
         by(union(constant(1d),
                  tail(local, 1)).sum()).
       sack(mult).
         by(limit(local, 1)).
       sack().sum()).
  cap('a')

Solution

  • I don't fully understand your question, but I read it to mean that you want to find the unique paths among these vertices, which you could get with:

    gremlin> g.V().has('uuid','e3348740-d37f-1031-8b5b-89fbb6fdad64').
    ......1>   repeat(out().simplePath()).
    ......2>     until(has('uuid','2d9fba40-74c7-1033-8e84-d3a6c90ad2e9')).
    ......3>   path().
    ......4>   dedup()
    ==>[v[2],v[8]]
    ==>[v[2],v[4],v[8]]
    ==>[v[2],v[6],v[8]]
    ==>[v[2],v[0],v[8]]
    ==>[v[2],v[4],v[6],v[8]]
    ==>[v[2],v[4],v[0],v[8]]
    ==>[v[2],v[6],v[0],v[8]]
    ==>[v[2],v[6],v[4],v[8]]
    ==>[v[2],v[0],v[6],v[8]]
    ==>[v[2],v[0],v[4],v[8]]
    ==>[v[2],v[4],v[6],v[0],v[8]]
    ==>[v[2],v[4],v[0],v[6],v[8]]
    ==>[v[2],v[6],v[0],v[4],v[8]]
    ==>[v[2],v[6],v[4],v[0],v[8]]
    ==>[v[2],v[0],v[6],v[4],v[8]]
    ==>[v[2],v[0],v[4],v[6],v[8]]
    

    Of course, given the structure of your graph, you have to traverse a ton of paths to get to that:

    gremlin> g.V().has('uuid','e3348740-d37f-1031-8b5b-89fbb6fdad64').
    ......1>   repeat(out().simplePath()).
    ......2>     until(has('uuid','2d9fba40-74c7-1033-8e84-d3a6c90ad2e9')).
    ......3>   path().
    ......4>   count()
    ==>34794
    

    I'd further assume that this is sample data and that your actual data could have even more paths to evaluate. When considering the performance implications, you would need to keep an eye on this number, as well as on the total number of edges that could be traversed in the worst case for this traversal. Note that indices will not help with this traversal beyond the initial vertex lookup. However, if you were to filter the paths further in a way that reduces the number of edges returned from the database (e.g. "find me all paths on '2017-11-02T00:00:00Z'"), vertex-centric indices might be quite helpful.
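
As a rough cross-check of the two numbers in the answer (16 unique paths versus 34794 traversed paths), here is a small standalone Python sketch. It is not Gremlin and does not touch JanusGraph; it only multiplies the number of parallel edges along each simple vertex route. The per-pair edge counts are an assumption tallied by hand from the `addEdge` calls in the question (all edge labels combined), and the names `EDGE_COUNTS` and `count_paths` are made up for this illustration.

```python
from itertools import permutations
from math import prod

# Number of parallel out-edges per ordered vertex pair, tallied by hand
# from the addEdge calls in the question (all labels combined).
EDGE_COUNTS = {
    ("suresh", "robin_niu"): 5,
    ("suresh", "hujunjie"): 5,
    ("suresh", "yanjuqi"): 21,
    ("suresh", "clintpollitt"): 61,
    ("clintpollitt", "robin_niu"): 3,
    ("clintpollitt", "hujunjie"): 5,
    ("clintpollitt", "yanjuqi"): 15,
    ("clintpollitt", "suresh"): 20,
    ("yanjuqi", "robin_niu"): 3,
    ("yanjuqi", "hujunjie"): 5,
    ("yanjuqi", "clintpollitt"): 4,
    ("yanjuqi", "suresh"): 2,
    ("robin_niu", "hujunjie"): 7,
    ("robin_niu", "yanjuqi"): 3,
    ("robin_niu", "clintpollitt"): 1,
    ("robin_niu", "suresh"): 2,
    ("hujunjie", "robin_niu"): 3,
    ("hujunjie", "clintpollitt"): 3,
    ("hujunjie", "suresh"): 3,
    ("hujunjie", "yanjuqi"): 4,
}


def count_paths(start, end, edge_counts):
    """Return (unique vertex routes, routes weighted by parallel edges).

    A route is a simple path from start that stops the first time it
    reaches end, which mirrors repeat(out().simplePath()).until(...).
    """
    vertices = {v for pair in edge_counts for v in pair}
    others = sorted(vertices - {start, end})
    unique = 0
    total = 0
    # Try every ordering of every subset of the intermediate vertices.
    for k in range(len(others) + 1):
        for middle in permutations(others, k):
            route = (start, *middle, end)
            # Each hop can be taken over any of its parallel edges.
            ways = prod(edge_counts.get(hop, 0) for hop in zip(route, route[1:]))
            if ways:
                unique += 1
                total += ways
    return unique, total


unique_paths, traversed_paths = count_paths("robin_niu", "yanjuqi", EDGE_COUNTS)
print(unique_paths, traversed_paths)  # prints: 16 34794
```

With these counts, a single route such as robin_niu -> hujunjie -> suresh -> clintpollitt -> yanjuqi accounts for 7 * 3 * 61 * 15 = 19215 of the traversed paths, which suggests that the parallel edges, not the number of vertices, dominate the cost.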