I want to find duplicate records and set the property isDuplicate to yes on each of them.
I am able to find the duplicate records, but I couldn't find a way to update the property.
g.V() \
.has("customerId") \
.group().by("customerId") \
.unfold() \
.toList()
The above query also returns records whose customerId occurs only once. I want to exclude those as well.
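Here is one way to do it, shown on a small TinkerGraph with sample data: group the vertices by customerId, keep only the groups that contain more than one vertex, then unfold those groups and set the property on each vertex.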
gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV('person').property('customerId','alice').
......1> addV('person').property('customerId','alice').
......2> addV('person').property('customerId','bob').
......3> addV('person').property('customerId','alice').iterate()
gremlin> g.V().hasLabel('person').has('customerId').
......1> group().by('customerId').
......2> unfold().
......3> select(values).filter(count(local).is(gt(1))).unfold().
......4> property('isDuplicate','yes')
==>v[0]
==>v[2]
==>v[6]
gremlin> g.V().elementMap()
==>[id:0,label:person,customerId:alice,isDuplicate:yes]
==>[id:2,label:person,customerId:alice,isDuplicate:yes]
==>[id:4,label:person,customerId:bob]
==>[id:6,label:person,customerId:alice,isDuplicate:yes]
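
To make the logic of the traversal easier to follow, here is a rough plain-Python model of what the steps do. It is only a sketch and does not talk to a graph: the dicts stand in for the sample vertices above, and mark_duplicates is a hypothetical helper name, not part of any Gremlin API.

```python
from collections import defaultdict

# Stand-ins for the four sample vertices created in the console session.
vertices = [
    {"id": 0, "label": "person", "customerId": "alice"},
    {"id": 2, "label": "person", "customerId": "alice"},
    {"id": 4, "label": "person", "customerId": "bob"},
    {"id": 6, "label": "person", "customerId": "alice"},
]


def mark_duplicates(vertices, key="customerId"):
    """Hypothetical helper mirroring group / filter / unfold / property."""
    # has(key) + group().by(key): bucket vertices by their key value.
    groups = defaultdict(list)
    for v in vertices:
        if key in v:
            groups[v[key]].append(v)

    marked = []
    for members in groups.values():
        # filter(count(local).is(gt(1))): drop groups with a single member.
        if len(members) > 1:
            # unfold() + property('isDuplicate', 'yes'): update each vertex.
            for v in members:
                v["isDuplicate"] = "yes"
                marked.append(v)
    return marked


marked = mark_duplicates(vertices)
print([v["id"] for v in marked])  # prints [0, 2, 6]
```

If you run the real traversal from Python rather than the Gremlin Console, the step names that clash with Python keywords usually take a trailing underscore in gremlinpython (filter_, is_), and values, local and gt are written as Column.values, Scope.local and P.gt. Please check that against the version of the driver you are using.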