Tags: gremlin, graph-databases, amazon-neptune, tinkerpop3, azure-cosmosdb-gremlinapi

Gremlin simple path query to get a path based on a property of the first edge encountered


Sample Graph (image)

Code to generate the vertices

        graph.addV("organization")
            .property("name", "CITI")
            .property("type", "ORG")
            .property(T.id, "1")
            .property("orgName", "CITI")
            .iterate();
        graph.addV("component")
            .property("name", "comop1")
            .property("type", "Physical")
            .property("orgName", "CITI")
            .property("app", "APP1")
            .property(T.id, "4013496")
            .iterate();
        graph.addV("component")
            .property("name", "comp2")
            .property("app", "APP2")
            .property("orgName", "ORG")
            .property("type", "System")
            .property(T.id, "2105820")
            .iterate();
        graph.addV("component")
            .property("name", "comp3")
            .property("app", "APP2")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .property(T.id, "2105830")
            .iterate();
        graph.addV("component")
            .property("name", "comp4")
            .property("app", "APP2")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .property(T.id, "2100982")
            .iterate();
        graph.addV("component")
            .property("name", "comp5")
            .property("app", "APP3")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .property(T.id, "4007086")
            .iterate();
        graph.addV("component")
            .property("name", "comp6")
            .property("app", "APP3")
            .property(T.id, "4007087")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .iterate();
        graph.addV("component")
            .property("name", "comp7")
            .property("app", "APP3")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .property(T.id, "4003585")
            .iterate();
        graph.addV("component")
            .property("name", "comp8")
            .property("app", "APP3")
            .property("orgName", "CITI")
            .property("type", "Logical")
            .property(T.id, "4003586")
            .iterate();
        
        graph.addV("organization")
            .property("name", "BOFA")
            .property("orgName", "BOFA")
            .property("type", "Logical")
            .property(T.id, "2")
            .iterate();
        graph.addV("organization")
            .property("name", "JPMC")
            .property("orgName", "JPMC")
            .property("type", "Logical")
            .property(T.id, "3")
            .iterate();

Code to generate the edges

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge1").from(__.V("1")).to(__.V("4013496")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge2").from(__.V("1")).to(__.V("4013496")).iterate();
        
        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge3").from(__.V("4013496")).to(__.V("2105820")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge4").from(__.V("4013496")).to(__.V("2105820")).iterate();

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge5").from(__.V("2105820")).to(__.V("2105830")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge6").from(__.V("2105820")).to(__.V("2105830")).iterate();

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge7").from(__.V("2105830")).to(__.V("2100982")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge8").from(__.V("2105830")).to(__.V("2100982")).iterate();

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge9").from(__.V("2100982")).to(__.V("4007086")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge10").from(__.V("2100982")).to(__.V("4007087")).iterate();

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge11").from(__.V("4007086")).to(__.V("4003585")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge12").from(__.V("4007087")).to(__.V("4003586")).iterate();

        graph.addE("commercialService").property("name", "CS1").property(T.id, "edge13").from(__.V("4003585")).to(__.V("2")).iterate();
        graph.addE("commercialService").property("name", "CS2").property(T.id, "edge14").from(__.V("4003586")).to(__.V("3")).iterate();

In this sample graph, two parallel edges connect each pair of vertices until the paths separate. E1, E2 ... E14 are the edge IDs (edge1 ... edge14 in the code), and CS1 and CS2 are values of the "name" property of the edges (see the "Sample Graph" image above).

I am trying to get the simple paths using the Java Gremlin query below:

graph.V("1").
      repeat(outE().otherV().simplePath()).
      until(outE().count().is(0)).
      dedup().
      group().
        by("name").
        by(path()).
      next();

This gives me the result as a Map<Object, Object> whose keys are JPMC and BOFA, with a different path as each value. For example:

path[v[1], e[edge1][1-commercialService->4013496], v[4013496], e[edge4][4013496-commercialService->2105820], v[2105820], e[edge6][2105820-commercialService->2105830], v[2105830], e[edge7][2105830-commercialService->2100982], v[2100982], e[edge10][2100982-commercialService->4007087], v[4007087], e[edge12][4007087-commercialService->4003586], v[4003586], e[edge14][4003586-commercialService->3], v[3]]

But when I iterate over this path in Java and read the edge property "name", I get both CS1 and CS2. It seems that when the graph builds the path, it does not matter which edge is used to reach the next node.

What I am looking for instead is a way to get the paths grouped by the "name" property of the edge, like below:

path[v[1], e[edge1][1-commercialService->4013496], v[4013496], e[edge3][4013496-commercialService->2105820], v[2105820], e[edge5][2105820-commercialService->2105830], v[2105830], e[edge7][2105830-commercialService->2100982], v[2100982], e[edge9][2100982-commercialService->4007086], v[4007086], e[edge11][4007086-commercialService->4003585], v[4003585], e[edge13][4003585-commercialService->2], v[2]]

Second solution tried

graph.V(orgId).repeat(outE().order().by("name").otherV().simplePath()).until(outE().count().is(0)).dedup().path().toList();

This time the traversal always follows the same edge (CS1) until it reaches the common node where the paths split. Output:

path[v[1], e[edge1][1-commercialService->4013496], v[4013496], e[edge3][4013496-commercialService->2105820], v[2105820], e[edge5][2105820-commercialService->2105830], v[2105830], e[edge7][2105830-commercialService->2100982], v[2100982], e[edge9][2100982-commercialService->4007086], v[4007086], e[edge11][4007086-commercialService->4003585], v[4003585], e[edge13][4003585-commercialService->2], v[2]]

path[v[1], e[edge1][1-commercialService->4013496], v[4013496], e[edge3][4013496-commercialService->2105820], v[2105820], e[edge5][2105820-commercialService->2105830], v[2105830], e[edge7][2105830-commercialService->2100982], v[2100982], e[edge10][2100982-commercialService->4007087], v[4007087], e[edge12][4007087-commercialService->4003586], v[4003586], e[edge14][4003586-commercialService->3], v[3]]

  1. One option is to pass the "name" property value into the query itself so that it follows a particular path, but I do not have that value available to pass in. Instead, can we somehow use the "name" property of the very first edge we encounter in the path?
  2. Also, is there any way to get all the properties of each vertex/edge populated when we fetch the path?
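
One way to see why the first query can return a path that mixes CS1 and CS2 edges is to count the paths outside the database. The following is a minimal plain-Java sketch, not TinkerPop code. The class name `PathCountSketch`, its method, and the array layout are hypothetical, and it assumes the 14 edges listed above and a graph without cycles (so every path is already a simple path). It counts how many distinct edge sequences lead from vertex 1 to each vertex that has no outgoing edges.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the sample graph (hypothetical helper, not TinkerPop code).
final class PathCountSketch {

    // Each row is {edge id, edge "name", source vertex id, target vertex id}.
    static final String[][] EDGES = {
        {"edge1", "CS1", "1", "4013496"},        {"edge2", "CS2", "1", "4013496"},
        {"edge3", "CS1", "4013496", "2105820"},  {"edge4", "CS2", "4013496", "2105820"},
        {"edge5", "CS1", "2105820", "2105830"},  {"edge6", "CS2", "2105820", "2105830"},
        {"edge7", "CS1", "2105830", "2100982"},  {"edge8", "CS2", "2105830", "2100982"},
        {"edge9", "CS1", "2100982", "4007086"},  {"edge10", "CS2", "2100982", "4007087"},
        {"edge11", "CS1", "4007086", "4003585"}, {"edge12", "CS2", "4007087", "4003586"},
        {"edge13", "CS1", "4003585", "2"},       {"edge14", "CS2", "4003586", "3"}
    };

    // Counts, per end vertex, the distinct edge sequences that start at `start`
    // and stop at a vertex with no outgoing edges. Assumes the graph is acyclic.
    static Map<String, Integer> countPathsByEndVertex(String start) {
        Map<String, Integer> counts = new TreeMap<>();
        walk(start, counts);
        return counts;
    }

    private static void walk(String vertex, Map<String, Integer> counts) {
        boolean leaf = true;
        for (String[] edge : EDGES) {
            if (edge[2].equals(vertex)) {
                leaf = false;
                walk(edge[3], counts); // one recursive call per parallel edge
            }
        }
        if (leaf) {
            counts.merge(vertex, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        System.out.println(countPathsByEndVertex("1")); // prints {2=16, 3=16}
    }
}
```

There are 16 distinct paths to each organization because each of the first four hops offers two parallel edges. Since dedup() keeps only the first traverser that arrives at each end vertex, the path that survives can use any combination of CS1 and CS2 edges on the shared hops.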

Solution

  • I initially tried using sack to write the query, but I ran into an unexpected issue. I then decided to use as labels instead, which resulted in the query below:

    g.V('1').
      outE().as('e').
      values('name').as('n').
      select('e').
      inV().
      repeat(
        outE().where(eq('n')).by('name').by().inV()).
      until(not(out())).
      path().from('e').
        by('name').
        by(label)
    

    This yields

    1   path[CS1, component, CS1, component, CS1, component, CS1, component, CS1, component, CS1, component, CS1, organization]
    2   path[CS2, component, CS2, component, CS2, component, CS2, component, CS2, component, CS2, component, CS2, organization]
    

    You can of course label the path however you wish.

    g.V('1').
      outE().as('e').
      values('name').as('n').
      select('e').
      inV().
      repeat(
        outE().where(eq('n')).by('name').by().inV()).
      until(not(out())).
      path().from('e').
        by('name')
    

    Which yields:

    1   path[CS1, comop1, CS1, comp2, CS1, comp3, CS1, comp4, CS1, comp5, CS1, comp7, CS1, BOFA]
    2   path[CS2, comop1, CS2, comp2, CS2, comp3, CS2, comp4, CS2, comp6, CS2, comp8, CS2, JPMC]
    

    With this approach, if you want the starting vertex in the path result, things have to be rearranged a bit so that the value (the edge name) calculated before the repeat starts is not itself part of the path result. In the prior examples, the from step simply started the path at the first edge. The version below includes the initial vertex.

    g.V('1').as('v1').
      outE().values('name').as('n').
      select('v1').as('start').
      repeat(
        outE().where(eq('n')).by('name').by().inV()).
      until(not(out())).
      path().from('start').
        by('name')
    

    and we now get:

    1   path[CITI, CS1, comop1, CS1, comp2, CS1, comp3, CS1, comp4, CS1, comp5, CS1, comp7, CS1, BOFA]
    2   path[CITI, CS2, comop1, CS2, comp2, CS2, comp3, CS2, comp4, CS2, comp6, CS2, comp8, CS2, JPMC]
    

    Here is a visual representation of the final query. Hopefully this matches your image that we started the discussion with.

    [Image: visual representation of the final query]
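
To make the logic of the queries above concrete, here is a minimal plain-Java sketch of the same idea: remember the name of the first edge taken, then follow only edges with that name until a vertex with no outgoing edges is reached. This is not TinkerPop code. The class `FirstEdgeNamePaths`, its method names, and the array layout are hypothetical, and the sketch assumes the 14 edges from the question and a graph without cycles.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the sample graph (hypothetical helper, not TinkerPop code).
final class FirstEdgeNamePaths {

    // Each row is {edge id, edge "name", source vertex id, target vertex id}.
    static final String[][] EDGES = {
        {"edge1", "CS1", "1", "4013496"},        {"edge2", "CS2", "1", "4013496"},
        {"edge3", "CS1", "4013496", "2105820"},  {"edge4", "CS2", "4013496", "2105820"},
        {"edge5", "CS1", "2105820", "2105830"},  {"edge6", "CS2", "2105820", "2105830"},
        {"edge7", "CS1", "2105830", "2100982"},  {"edge8", "CS2", "2105830", "2100982"},
        {"edge9", "CS1", "2100982", "4007086"},  {"edge10", "CS2", "2100982", "4007087"},
        {"edge11", "CS1", "4007086", "4003585"}, {"edge12", "CS2", "4007087", "4003586"},
        {"edge13", "CS1", "4003585", "2"},       {"edge14", "CS2", "4003586", "3"}
    };

    // Returns one path per outgoing edge of `start`. Each path alternates
    // vertex id, edge id, vertex id, ... and uses only edges whose name
    // matches the name of its first edge.
    static List<List<String>> pathsByFirstEdgeName(String start) {
        List<List<String>> result = new ArrayList<>();
        for (String[] first : EDGES) {
            if (first[2].equals(start)) {
                List<String> path = new ArrayList<>();
                path.add(start);
                follow(first, first[1], path, result);
            }
        }
        return result;
    }

    private static void follow(String[] edge, String name,
                               List<String> path, List<List<String>> result) {
        path.add(edge[0]); // edge id
        path.add(edge[3]); // target vertex id
        boolean hasOutgoing = false;
        for (String[] next : EDGES) {
            if (!next[2].equals(edge[3])) {
                continue;
            }
            hasOutgoing = true;
            if (next[1].equals(name)) { // same name as the first edge
                follow(next, name, new ArrayList<>(path), result);
            }
        }
        // Emit only when the vertex has no outgoing edges at all,
        // mirroring until(not(out())) in the queries above.
        if (!hasOutgoing) {
            result.add(path);
        }
    }

    public static void main(String[] args) {
        for (List<String> path : pathsByFirstEdgeName("1")) {
            System.out.println(path);
        }
    }
}
```

For vertex 1 this prints two paths: one that uses only the CS1 edges (edge1, edge3, edge5, edge7, edge9, edge11, edge13) and ends at vertex 2, and one that uses only the CS2 edges (edge2, edge4, edge6, edge8, edge10, edge12, edge14) and ends at vertex 3.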