gremlin

How to delete duplicate edges by inV, outV and specific properties?


I have duplicate edges in my graph that share the same inV, outV and some, but not all, properties. I would like to remove all but one of each set of duplicates.

Given the following graph:

g.addV().property(id, '1').
addV().property(id, '2').
addV().property(id, '3').
addV().property(id, '4').
addE('link').property('prop1', 000).property('prop2', 111).from(V('1')).to(V('2')).
addE('link').property('prop1', 000).property('prop2', 112).from(V('1')).to(V('2')).
addE('link').property('prop1', 000).property('prop2', 113).from(V('1')).to(V('2')).
addE('link').property('prop1', 222).property('prop2', 333).from(V('2')).to(V('3')).
addE('link').property('prop1', 222).property('prop2', 334).from(V('2')).to(V('3')).
addE('link').property('prop1', 222).property('prop2', 335).from(V('2')).to(V('3')).
addE('link').property('prop1', 222).property('prop2', 336).from(V('2')).to(V('3')).
addE('link').property('prop1', 333).property('prop2', 444).from(V('2')).to(V('3')).
addE('link').property('prop1', 333).property('prop2', 444).from(V('3')).to(V('4')).
addE('link').property('prop1', 333).property('prop2', 445).from(V('3')).to(V('4')).
addE('link').property('prop1', 333).property('prop2', 446).from(V('3')).to(V('4')).iterate()

I would like to delete all duplicates by inV, outV and prop1, so that only the following edges are left:

addE('link').property('prop1', 000).property('prop2', 111).from(V('1')).to(V('2')).
addE('link').property('prop1', 222).property('prop2', 336).from(V('2')).to(V('3')).
addE('link').property('prop1', 333).property('prop2', 444).from(V('2')).to(V('3')).
addE('link').property('prop1', 333).property('prop2', 446).from(V('3')).to(V('4'))

EDIT: To clarify, I want to deduplicate edges by checking inV, outV and prop1. If more than one edge matches on all three, I want to keep one and remove the rest, regardless of whether prop2 is unique.


Solution

g.E().as('e').outV().id().as('ov').
  select('e').inV().id().as('iv').
  select('e').properties('prop1').value().as('p1').
  select('e', 'ov', 'iv', 'p1').
  group().
    by(select('ov', 'iv', 'p1')).
    by(select('e')).
  select(values).as('unique_e').
  V().outE().where(without('unique_e')).drop()
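
To make the idea behind the traversal easier to follow, here is a minimal plain-Python sketch of the same group-and-keep-one logic, run against the sample edges from the question. It is not Gremlin and does not touch a graph: the `Edge` tuple and the `split_duplicates` function are hypothetical names made up for illustration. The sketch keeps the first edge it sees for each (outV, inV, prop1) key, which is an assumption on my part; the question accepts any single survivor per key.

```python
from collections import namedtuple

# Hypothetical stand-in for a graph edge; not a TinkerPop type.
Edge = namedtuple("Edge", ["out_v", "in_v", "prop1", "prop2"])

# The eleven sample edges from the question (prop1 000 is written as 0).
edges = [
    Edge("1", "2", 0, 111),
    Edge("1", "2", 0, 112),
    Edge("1", "2", 0, 113),
    Edge("2", "3", 222, 333),
    Edge("2", "3", 222, 334),
    Edge("2", "3", 222, 335),
    Edge("2", "3", 222, 336),
    Edge("2", "3", 333, 444),
    Edge("3", "4", 333, 444),
    Edge("3", "4", 333, 445),
    Edge("3", "4", 333, 446),
]


def split_duplicates(edges):
    """Group edges by (out_v, in_v, prop1) and keep one edge per group.

    Returns (kept, dropped). The first edge seen for a key is kept;
    every later edge with the same key is marked for removal.
    """
    kept = {}      # key -> the single edge to keep
    dropped = []   # edges that would be deleted
    for edge in edges:
        key = (edge.out_v, edge.in_v, edge.prop1)
        if key in kept:
            dropped.append(edge)
        else:
            kept[key] = edge
    return list(kept.values()), dropped


kept, dropped = split_duplicates(edges)
for edge in kept:
    print(edge)
```

With the sample data, four edges survive (one per distinct outV, inV and prop1 combination) and the other seven are marked for removal. prop2 plays no part in the key, which matches the clarification in the question.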