Tags: .net, gremlin, azure-cosmosdb-gremlinapi, gremlinnet

How to filter related objects using Gremlin Cosmos DB?


I want to get a result set in which each entry consists of a user and a list of all users who are not yet related to that user. The result should look like this:

[[user: [USEROBJECT], usersThatAreNotFriends: [[USEROBJECT]...]]...]

I am using the Cosmos DB Gremlin endpoint and am struggling to combine the set of all users with the set of users that are already related, so that the related ones can be filtered out.

My idea was:

g.V().hasLabel('user').as('user').flatMap(g.V().hasLabel('user').where(__.eq(select('user').out('isFriend')).fold()).as('usersThatAreNotFriends').select('user', 'usersThatAreNotFriends')

To set up my example, use:

g.addV('user').property('id','user_1').property('partition_key','1')
g.addV('user').property('id','user_2').property('partition_key','2')
g.addV('user').property('id','user_3').property('partition_key','3')
g.addV('user').property('id','user_4').property('partition_key','4')
g.V('user_1').addE('has_relation').to(g.V('user_2'))
g.V('user_2').addE('has_relation').to(g.V('user_1'))
g.V('user_2').addE('has_relation').to(g.V('user_4'))
g.V('user_4').addE('has_relation').to(g.V('user_2'))

The expected result, written in a simplified form:

[user_1: [user_3, user_4], user_2:[user_3], 
user_3:[user_1, user_2, user_3], user_4:[user_1, user_3]]
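
As a sanity check on the expected result, the same complement can be computed outside the graph. The following is a minimal plain-Python sketch, not Gremlin and not something the Cosmos DB endpoint runs; the names users, edges, and not_related are made up for illustration. It assumes that "related" means "has an outgoing has_relation edge to" and it leaves each user out of their own list, so its entry for user_3 differs from the listing above.

```python
# Plain-Python model of the sample graph (illustrative only, not Gremlin).
users = ["user_1", "user_2", "user_3", "user_4"]

# Directed 'has_relation' edges taken from the setup script.
edges = [
    ("user_1", "user_2"),
    ("user_2", "user_1"),
    ("user_2", "user_4"),
    ("user_4", "user_2"),
]


def not_related(users, edges):
    """Map each user to the users it has no outgoing edge to (itself excluded)."""
    related = {u: set() for u in users}
    for src, dst in edges:
        related[src].add(dst)
    return {
        u: [v for v in users if v != u and v not in related[u]]
        for u in users
    }


result = not_related(users, edges)
for user, others in result.items():
    print(user, others)
```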

Solution

  • I modified your setup script so that the IDs are set as real vertex IDs, using the id token instead of a property named 'id':

    g.addV('user').property(id,'user_1').property('partition_key','1')
    g.addV('user').property(id,'user_2').property('partition_key','2')
    g.addV('user').property(id,'user_3').property('partition_key','3')
    g.addV('user').property(id,'user_4').property('partition_key','4')
    g.V('user_1').addE('has_relation').to(g.V('user_2'))
    g.V('user_2').addE('has_relation').to(g.V('user_1'))
    g.V('user_2').addE('has_relation').to(g.V('user_4'))
    g.V('user_4').addE('has_relation').to(g.V('user_2'))   
    

    In your expected result, a person sometimes appears in their own list, so I am suggesting two different queries. The first one treats each person as not being friends with themselves, so they show up in their own not-friends list.

    gremlin> g.V().as('p').
    ......1>    project('person','not-friends').
    ......2>      by().
    ......3>      by(V().where(__.not(__.in('has_relation').as('p'))).fold())
    ==>[person:v[user_3],not-friends:[v[user_3],v[user_2],v[user_1],v[user_4]]]
    ==>[person:v[user_2],not-friends:[v[user_3],v[user_2]]]
    ==>[person:v[user_1],not-friends:[v[user_3],v[user_1],v[user_4]]]
    ==>[person:v[user_4],not-friends:[v[user_3],v[user_1],v[user_4]]] 
    

    If you do not want a person to appear in their own not-friends list, add a where(neq('p')) filter:

    gremlin> g.V().as('p').
    ......1>    project('person','not-friends').
    ......2>      by().
    ......3>      by(V().where(__.not(__.in('has_relation').as('p')).where(neq('p'))).fold())
    ==>[person:v[user_3],not-friends:[v[user_2],v[user_1],v[user_4]]]
    ==>[person:v[user_2],not-friends:[v[user_3]]]
    ==>[person:v[user_1],not-friends:[v[user_3],v[user_4]]]
    ==>[person:v[user_4],not-friends:[v[user_3],v[user_1]]]
    

    As a side note, to find out who each person's friends are, you can use a simple group().by() approach.

    gremlin> g.V().aggregate('all').group().by().by(out().fold()).unfold()
    ==>v[user_3]=[]
    ==>v[user_2]=[v[user_1], v[user_4]]
    ==>v[user_1]=[v[user_2]]
    ==>v[user_4]=[v[user_2]]
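
To make the grouping in the side note concrete, here is a minimal plain-Python sketch (not Gremlin) of what group().by().by(out().fold()) computes on the sample data: one key per vertex, with the targets of its outgoing edges as the value. The names users, edges, and friends_by_user are hypothetical and exist only for this illustration.

```python
# Plain-Python model of the sample graph (illustrative only, not Gremlin).
users = ["user_1", "user_2", "user_3", "user_4"]

# Directed 'has_relation' edges taken from the setup script.
edges = [
    ("user_1", "user_2"),
    ("user_2", "user_1"),
    ("user_2", "user_4"),
    ("user_4", "user_2"),
]


def friends_by_user(users, edges):
    """Group the targets of outgoing edges under their source vertex."""
    # Start with an empty list per user so that users without edges
    # still appear in the result, as v[user_3]=[] does in the console output.
    friends = {u: [] for u in users}
    for src, dst in edges:
        friends[src].append(dst)
    return friends


friends = friends_by_user(users, edges)
for user, targets in friends.items():
    print(user, targets)
```

Subtracting each of these lists from the full list of users gives the not-friends lists produced by the project() queries.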