Search code examples
graphgremlintitan

Common ancestor in a titan graph..


In a titan(hbase) graph, i would like to get common ancestors for given nodes. Using gremlin is there any way we can get the common ancestors? Or any other Utils / libs available to find common ancestors in a Titan graph for given nodes?


Solution

  • I'll make some assumptions on exactly what you're looking for here, but maybe this answer will help inspire you to come up with what you are looking for. I used the toy graph to demonstrate this approach and assumed the edges directionally pointed to children. Obviously this isn't a perfect tree, but there's enough data there to at least demonstrate the traversal. The following shows the "children" for whom I wanted to find ancestors:

    gremlin> g = TinkerGraphFactory.createTinkerGraph()
    ==>tinkergraph[vertices:6 edges:6]
    gremlin> children = [g.v(2),g.v(5),g.v(6)]
    ==>v[2]
    ==>v[5]
    ==>v[6]
    

    I can recursively loop up through the tree through ancestors and see the paths as follows:

    gremlin> children._().in.loop(1){it.loops<5}{true}.path
    ==>[v[2], v[1]]
    ==>[v[5], v[4]]
    ==>[v[5], v[4], v[1]]
    

    The above line caps the loop at 5 steps away from the ancestor and the {true} closure simply means to output values at all steps of the loop. A common ancestor would be one where the each path terminates more than once:

    gremlin> children._().in.loop(1){it.loops<5}{true}.groupCount.cap.next().findAll{it.value>1}
    ==>v[1]=2
    

    In this sense v[1] is the only common ancestor among the three child nodes and only 2 of the 3 child nodes share that ancestor. To make it more interesting, let's connect vertex 6 to vertex 4 and re-execute the traversal:

    gremlin> children._().in.loop(1){it.loops<5}{true}.groupCount.cap.next().findAll{it.value>1}
    ==>v[1]=3
    ==>v[4]=2
    

    Now vertex 4 is included as a common ancestor. It has children of vertex 6 and 5. Given the shared nature of vertex 4, the 6 also share vertex 1 now so that ancestor is now shared by all three vertices.

    UPDATE: The above answer is for TinkerPop 2.x and the following is for TinkerPop 3.x.

    Given this tree:

    gremlin> g.addV().property(id, 'A').as('a').
               addV().property(id, 'B').as('b').
               addV().property(id, 'C').as('c').
               addV().property(id, 'D').as('d').
               addV().property(id, 'E').as('e').
               addV().property(id, 'F').as('f').
               addV().property(id, 'G').as('g').
               addE('hasParent').from('a').to('b').
               addE('hasParent').from('b').to('c').
               addE('hasParent').from('d').to('c').
               addE('hasParent').from('c').to('e').
               addE('hasParent').from('e').to('f').
               addE('hasParent').from('g').to('f').iterate()
    

    you can find the common ancestor with:

    gremlin> g.V('A').
    ......1>   repeat(out('hasParent')).emit().as('x').
    ......2>   repeat(__.in('hasParent')).emit(hasId('D')).
    ......3>   select('x').limit(1)
    ==>v[C]
    

    This topic is discussed more fully in the TinkerPop recipes.