cassandra gremlin tinkerpop janus

Get the related vertices of a different label for a group aggregation based on date


    v1 = graph.addVertex(label, "l1", "submit_time", new Date("Fri Apr 26 21:01:36 PDT 2019")) // v[2345432]
    v2 = graph.addVertex(label, "l2", "start_time", new Date("Fri Apr 26 22:01:36 PDT 2019"))  // v[409632904]
    v3 = graph.addVertex(label, "l2", "start_time", new Date("Fri Apr 26 22:01:36 PDT 2019"))  // v[204824704]
    v4 = graph.addVertex(label, "l2", "start_time", new Date("Fri Apr 26 23:01:36 PDT 2019"))  // v[307241008]

    Edge e1 = v1.addEdge("e1", v2);
    Edge e2 = v1.addEdge("e1", v3);
    Edge e3 = v1.addEdge("e1", v4);




    g.V().hasLabel("l2").group().by(map{(it.get().value("start_time").getYear()+1900)+"/"+(it.get().value("start_time").getMonth()+1)+"/"+it.get().value("start_time").getDate()+" "+it.get().value("start_time").getHours()})

We are getting the following output (Output1): 2019/4/26 23:[v[307241008]], 2019/4/26 22:[v[409632904],v[204824704]]

Can anyone please help me extend this? The values are aggregated over the l2 vertices, and every l2 vertex has an edge connecting it to an l1 vertex. For each aggregated group I also need the corresponding l1 vertex, all in a single query. For example (Output2): 2019/4/26 23:[v[307241008]], v[2345432] 2019/4/26 22:[v[409632904],v[204824704]], v[2345432] Thanks.


Solution

  • Let me start with a proper script to create the sample graph, so others can follow along more easily:

    g = TinkerGraph.open().traversal()
    g.addV('l1').
        property(id, 2345432).
        property('submit_time', new Date('Fri Apr 26 21:01:36 PDT 2019')).
      addV('l2').
        property(id, 409632904).
        property('start_time', new Date('Fri Apr 26 22:01:36 PDT 2019')).
      addV('l2').
        property(id, 204824704).
        property('start_time', new Date('Fri Apr 26 22:01:36 PDT 2019')).
      addV('l2').
        property(id, 307241008).
        property('start_time', new Date('Fri Apr 26 23:01:36 PDT 2019')).
      addE('e1').from(V(2345432)).to(V(409632904)).
      addE('e1').from(V(2345432)).to(V(204824704)).
      addE('e1').from(V(2345432)).to(V(307241008)).iterate()
    

    And here is your query, properly formatted:

    g.V().hasLabel("l2").
      group().
        by {(it.value("start_time").getYear() + 1900) + "/" +
            (it.value("start_time").getMonth() + 1) + "/" +
             it.value("start_time").getDate() + " " +
             it.value("start_time").getHours()}
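
    A side note on the grouping key in the closure above: the java.util.Date getters it calls are deprecated, getYear() returns the year minus 1900, getMonth() is zero-based, and all of them use the JVM's default time zone, so the same data can land in different buckets on differently configured machines. The following is only a minimal sketch of the same key built with java.time and an explicit zone. The HourBucket class and its key method are hypothetical helpers, not part of TinkerPop, and the choice of America/Los_Angeles is an assumption based on the PDT timestamps in the question.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;

// Hypothetical helper (not TinkerPop API): builds the same "year/month/day hour"
// bucket key as the closure above, but in an explicitly chosen time zone.
final class HourBucket {
    private HourBucket() {}

    static String key(Date date, ZoneId zone) {
        ZonedDateTime t = date.toInstant().atZone(zone);
        // getMonthValue() is 1-based and getYear() is the full year,
        // so no "+ 1" or "+ 1900" corrections are needed here.
        return t.getYear() + "/" + t.getMonthValue() + "/"
                + t.getDayOfMonth() + " " + t.getHour();
    }

    public static void main(String[] args) {
        // Assumed zone, chosen to match the PDT timestamps in the sample data.
        ZoneId pacific = ZoneId.of("America/Los_Angeles");
        Date start = Date.from(
                ZonedDateTime.of(2019, 4, 26, 22, 1, 36, 0, pacific).toInstant());
        System.out.println(key(start, pacific)); // prints 2019/4/26 22
    }
}
```

    Inside a Groovy closure such a helper could presumably be called as HourBucket.key(it.value("start_time"), zone), provided the class is on the classpath of wherever the traversal is evaluated.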
    

    Now, if you want to add all the l1 vertices, you can no longer use a simple Map for your result. Each entry needs its own map so that you can capture a third field. Thus, you need to unfold the map and reshape each entry with a project() step:

    g.V().hasLabel("l2").
      group().
        by {(it.value("start_time").getYear() + 1900) + "/" +
            (it.value("start_time").getMonth() + 1) + "/" +
             it.value("start_time").getDate() + " " +
             it.value("start_time").getHours()}.
      unfold().
      project('time','l2','l1').
        by(keys).
        by(values).
        by(select(values).unfold().in('e1').dedup().fold())
    

    This will yield:

    gremlin> g.V().hasLabel("l2").
    ......1>   group().
    ......2>     by {(it.value("start_time").getYear() + 1900) + "/" +
    ......3>         (it.value("start_time").getMonth() + 1) + "/" +
    ......4>          it.value("start_time").getDate() + " " +
    ......5>          it.value("start_time").getHours()}.
    ......6>   unfold().
    ......7>   project('time','l2','l1').
    ......8>     by(keys).
    ......9>     by(values).
    .....10>     by(select(values).unfold().in('e1').dedup().fold())
    ==>[time:2019/4/26 23,l2:[v[307241008]],l1:[v[2345432]]]
    ==>[time:2019/4/26 22,l2:[v[409632904],v[204824704]],l1:[v[2345432]]]
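
    If the unfold() and project() combination above is hard to picture, the following plain-Java model may help. It is only a sketch of the data reshaping and uses no TinkerPop API: vertex ids stand in for vertices, the incoming map is an assumed stand-in for the e1 edges, and ReshapeSketch and reshape are hypothetical names. The trailing comments map each line to the traversal step it imitates.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model of the reshaping; not TinkerPop code.
final class ReshapeSketch {
    private ReshapeSketch() {}

    // groups:   time bucket -> ids of the l2 vertices in that bucket
    // incoming: l2 id -> ids of the l1 vertices with an 'e1' edge pointing at it
    static List<Map<String, Object>> reshape(Map<String, List<Long>> groups,
                                             Map<Long, List<Long>> incoming) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map.Entry<String, List<Long>> entry : groups.entrySet()) { // unfold()
            Set<Long> l1 = new LinkedHashSet<>();                        // dedup()
            for (Long l2 : entry.getValue()) {                           // select(values).unfold()
                l1.addAll(incoming.getOrDefault(l2, List.of()));         // in('e1')
            }
            Map<String, Object> row = new LinkedHashMap<>();             // project('time','l2','l1')
            row.put("time", entry.getKey());                             // by(keys)
            row.put("l2", entry.getValue());                             // by(values)
            row.put("l1", new ArrayList<>(l1));                          // fold()
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        groups.put("2019/4/26 23", List.of(307241008L));
        groups.put("2019/4/26 22", List.of(409632904L, 204824704L));
        Map<Long, List<Long>> incoming = Map.of(
                307241008L, List.of(2345432L),
                409632904L, List.of(2345432L),
                204824704L, List.of(2345432L));
        reshape(groups, incoming).forEach(System.out::println);
    }
}
```

    Each grouped entry becomes its own three-field map, which is why the single Map returned by group() is no longer enough once the l1 vertices are added.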