gremlin tinkerpop gremlin-server azure-cosmosdb-gremlinapi

How to merge values from different objects in gremlin query?


I have a query that returns its output in the following format:

{
    "Key": [
      "Value1",
      "Value2"
    ],
    "Count": [
      {
        "Count1": 28,
        "Count2": 28
      },
      {
        "Count3": 16,
        "Count4": 16
      }
    ]
  }

I want to display it in the following format:

[
  {
     "Key" : "Value1",
     "Count1": 28,
     "Count2": 28
  },
  {
     "Key" : "Value2",
     "Count3": 16,
     "Count4": 16
  }
]

Is it possible?

The Gremlin query that produces similar output:


g.V().
  has('organizationId', 'b121672e-8049-40cc-9f28-c62dff4cc2d9').
  hasLabel('employee').
  group().
    by('officeId').
    by(project('Id', 'Status').
         by(choose(has('officeId'), constant('Total'), constant(''))).
         by(coalesce(out('hasStatus').
                       or(has('release', is(false)),
                          has('autoRelease', is(true)).
                            has('release', is(true)).
                            has('endDate', gte(637250976000000000))),
                     values('status'),
                     constant('Green'))).
       select(values).
       unfold().
       groupCount()).
  project('Id', 'Count').
    by(select(keys)).
    by(select(values))

The data consists of employee vertices and healthStatus vertices, with a hasStatus edge from an employee to a healthStatus.

Properties in employee vertex: id, organizationId, officeId, name, createdOn

Properties in healthStatus vertex: id, status, startDate, endDate, release, autoRelease, createdOn

Sample Data

 g.addV('employee').
       property('id',1).
       property('organizationId',1).
       property('officeId',1).
       property('name','A').
       property('createdOn', 637263231140000000).as('1').
    addV('employee').
       property('id',2).
       property('organizationId',1).
       property('officeId',2).
       property('name','B').
       property('createdOn', 637263231140000000).as('2').
   addV('employee').
       property('id',5).
       property('organizationId',1).
       property('officeId',3).
       property('name','C').
       property('createdOn', 637263231140000000).as('5').
    addV('healthStatus').
       property('id',3).
       property('status','Red').
       property('startDate',637262367140000000).
       property('endDate',637264095140000000).
       property('release',false).
       property('createdOn',637262367140000000)as('3').
    addV('healthStatus').
       property('id',4).
       property('status','Yellow').
       property('startDate',637262367140000000).
       property('endDate',637264095140000000).
       property('release',false).
       property('createdOn',637262367140000000)as('4').
    addE('hasStatus').from('1').to('3').
    addE('hasStatus').from('4').to('4')

Output:

[
  {
    "Id" : [
        1,
        2,
        3
      ]
  },
  {
    "Count": [
      {
         "Red" : 1
      },
      {
         "Yellow" : 1
      },
      {
         "Green" : 1
      }
    ]  
  }
]

Expected Output

[
  {
     "Id" : 1,
     "Red" : 1
  },
  {
     "Id" : 2,
     "Yellow" : 1
  },
  {
     "Id" : 3,
     "Green" : 1
  }
]

Note: the Id in the projection is the officeId from the employee vertex.


Solution

  • I think I've captured what you wanted. Your sample data script had some errors, and I wanted extra data to check that the counts made sense, so I added a bit:

     g = TinkerGraph.open().traversal()
     g.addV('employee').
           property('id',1).
           property('organizationId',1).
           property('officeId',1).
           property('name','A').
           property('createdOn', 637263231140000000).as('1').
        addV('employee').
           property('id',2).
           property('organizationId',1).
           property('officeId',2).
           property('name','B').
           property('createdOn', 637263231140000000).as('2').
       addV('employee').
           property('id',5).
           property('organizationId',1).
           property('officeId',3).
           property('name','C').
           property('createdOn', 637263231140000000).as('5').
       addV('employee').
           property('id',6).
           property('organizationId',1).
           property('officeId',3).
           property('name','D').
           property('createdOn', 637263231140000000).as('6').
        addV('healthStatus').
           property('id',3).
           property('status','Red').
           property('startDate',637262367140000000).
           property('endDate',637264095140000000).
           property('release',false).
           property('createdOn',637262367140000000).as('3').
        addV('healthStatus').
           property('id',4).
           property('status','Yellow').
           property('startDate',637262367140000000).
           property('endDate',637264095140000000).
           property('release',false).
           property('createdOn',637262367140000000).as('4').
        addE('hasStatus').from('1').to('3').
        addE('hasStatus').from('2').to('4').
        addE('hasStatus').from('6').to('4')
    

    I've rewritten your traversal a bit to take a different approach that I think returns the data you expect, though in a slightly different form:

    gremlin> g.V().has('employee','organizationId', 1).
    ......1>   project('Id', 'Status').
    ......2>     by('officeId').
    ......3>     by(coalesce(out('hasStatus').
    ......4>                 or(has('release', false),                          
    ......5>                    has('autoRelease', true).has('release', true).has('endDate', gte(637250976000000000))). 
    ......6>                 values('status'), 
    ......7>                 constant('Green'))).
    ......8>   group().
    ......9>     by(select('Id')).
    .....10>     by(groupCount().
    .....11>          by('Status'))
    ==>[1:[Red:1],2:[Yellow:1],3:[Yellow:1,Green:1]]
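
    To make the grouping easier to follow, here is a minimal plain-Python sketch of what the group().by(select('Id')).by(groupCount().by('Status')) part does. It is only a model, not Gremlin: the rows list is a hand-written stand-in for what project('Id', 'Status') would emit for the sample data above, and group_status_counts is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Hand-written stand-in for the rows project('Id', 'Status') would emit for
# the sample data: one row per employee, with 'Green' used when coalesce()
# finds no qualifying healthStatus vertex.
rows = [
    {"Id": 1, "Status": "Red"},     # employee A
    {"Id": 2, "Status": "Yellow"},  # employee B
    {"Id": 3, "Status": "Green"},   # employee C, no hasStatus edge
    {"Id": 3, "Status": "Yellow"},  # employee D
]


def group_status_counts(rows):
    """Group rows by 'Id' and count each 'Status' within the group."""
    grouped = defaultdict(Counter)
    for row in rows:
        grouped[row["Id"]][row["Status"]] += 1
    # Convert the Counters to plain dicts for a cleaner result.
    return {office_id: dict(counts) for office_id, counts in grouped.items()}


print(group_status_counts(rows))
```

    Office 3 ends up with two entries because it has two employees, one with a Yellow status and one with no status at all.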
    

    I prefer this form a bit, but perhaps you require the original format you inquired about, in which case you need another round of manipulation on the collection:

    gremlin> g.V().has('employee','organizationId', 1).
    ......1>   project('Id', 'Status').
    ......2>     by('officeId').
    ......3>     by(coalesce(out('hasStatus').
    ......4>                 or(has('release', false),                          
    ......5>                    has('autoRelease', true).has('release', true).has('endDate', gte(637250976000000000))). 
    ......6>                 values('status'), 
    ......7>                 constant('Green'))).
    ......8>   group().
    ......9>     by(select('Id')).
    .....10>     by(groupCount().
    .....11>          by('Status')).
    .....12>   unfold().
    .....13>   map(union(project('Id').by(select(keys)),
    .....14>             select(values)).
    .....15>       unfold().
    .....16>       group().by(keys).by(select(values)))
    ==>[Red:1,Id:1]
    ==>[Yellow:1,Id:2]
    ==>[Yellow:1,Id:3,Green:1]
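
    The extra unfold()/union()/group() steps only flatten each (officeId, counts) entry into a single map. If it is easier to reshape on the client, the same idea can be sketched in plain Python; this assumes the grouped result has already been fetched as a dictionary, and flatten_groups is a hypothetical helper name rather than part of any Gremlin driver.

```python
def flatten_groups(grouped):
    """Turn {officeId: {status: count}} into one flat map per office."""
    flattened = []
    for office_id, counts in grouped.items():  # one entry per office, like unfold()
        merged = {"Id": office_id}             # the key becomes the 'Id' field
        merged.update(counts)                  # the status counts are merged alongside it
        flattened.append(merged)
    return flattened


# Assumed input: the grouped result from the first traversal, as a dict.
grouped = {1: {"Red": 1}, 2: {"Yellow": 1}, 3: {"Yellow": 1, "Green": 1}}
for entry in flatten_groups(grouped):
    print(entry)
```

    Note that this sketch would overwrite the office id if a status were literally named "Id", so it relies on status names never colliding with that key.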