I am new in the graph database. It might be a very basic question but any help is really appreciated. I am using Gremlin/Tinkerpop 3 to query a graph stored in TitanDB.
Following is my data set.
User | associated_ids
---------------------------------
A | [ 1, 2 ]
B | [ 3, 4 ]
c | [ 5 ]
D | [ 4, 2 ]
Now want to get all the user's group who are linked with each other with any of the associated_id. Here [A] & [D] is directly linked as both have common entity [2] and [B] & [D] is directly linked as both have common entity [4]. So indirectly [A], [B] & [D] are linked.
Expected result something like.
Users | associated_ids
----------------------------
A,B,D | [ 1, 2 ,3, 4]
c | [ 5 ]
Following are the steps I have performed to generate the graph.
a = graph.addVertex(label, "person", "user", "A")
b = graph.addVertex(label, "person", "user", "B")
c = graph.addVertex(label, "person", "user", "C")
d = graph.addVertex(label, "person", "user", "D")
one = graph.addVertex('rec_id')
one.property('ids', '1')
two = graph.addVertex('rec_id')
two.property('ids', '2')
three = graph.addVertex('rec_id')
three.property('ids', '3')
four = graph.addVertex('rec_id')
four.property('ids', '4')
five = graph.addVertex('rec_id')
five.property('ids', '5')
a.addEdge('part_of',one)
a.addEdge('part_of',two)
b.addEdge('part_of', three)
b.addEdge('part_of',four)
c.addEdge('part_of',five)
d.addEdge('part_of',four)
d.addEdge('part_of',two)
Now wanted to know how can I get all the users who are associated with each other via any of the rec_id
.
What would be the Gremlin query to get the desired output?
Also please let me know if I need to do any change in my graph structure to meet my requirement.
What you're basically looking for, are connected components.
gremlin> g.V().
emit(cyclicPath().or().not(both())).
repeat(both()).
until(cyclicPath()).
aggregate("p").by(path()).cap("p").
unfold().limit(local, 1).dedup().
map(__.as("v").select("p").unfold().
filter(unfold().where(eq("v"))).
unfold().dedup().order().by(id).fold()).dedup().
project("Users","associated_ids").
by(unfold().hasLabel("person").values("user").fold()).
by(unfold().hasLabel("rec_id").values("ids").fold())
==>[Users:[A,B,D],associated_ids:[1,2,3,4]]
==>[Users:[C],associated_ids:[5]]