gremlin, amazon-neptune

Deleting a single graph from Neptune (not all nodes!)


I have several separate, disconnected graphs on one Neptune instance, and I need to delete only one of them. My idea is to specify the id of a node in that graph and then delete every node reachable from it, following edges IN BOTH DIRECTIONS. Both directions matter because many nodes (I'll call them "reverse nodes") are attached to the graph only by outgoing edges that point toward it. I can't find the right query; this one deletes all nodes except the "reverse nodes":

g.V(source.getId()).emit().repeat(out()).drop().iterate();


Solution

  • If you need to reach vertices in both directions (without walking around cycles), this should work:

    g.V(source.getId()).emit().repeat(both().simplePath()).drop().iterate();
    

    A few things to consider:

    • If you are doing this at scale, consider fetching the IDs of all vertices and edges in the subgraph first. Then drop all the edges with g.E(<list of edge ids>).drop(), followed by all the vertices with g.V(<list of vertex ids>).drop(). Deleting directly by ID incurs narrower locks.

    • Mutation queries in Neptune are single-threaded, so a single drop query only ever runs on one query execution thread (half of a vCPU).

    • If you drop edges and vertices separately, you can also run batches of edges in parallel, and then batches of vertices in parallel, to speed up the overall drop.
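
The "drop by ID, in parallel batches" idea could be sketched roughly as follows. This is a minimal Python illustration, not Neptune's or TinkerPop's API. The names `batches`, `drop_in_parallel` and `drop_subgraph`, the batch size of 500 and the worker count of 4 are all hypothetical. The two callbacks stand in for whatever client call actually sends g.E(<list of edge ids>).drop() or g.V(<list of vertex ids>).drop() to the cluster, and fetching the ID lists is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

# Hypothetical callback type: receives one batch of IDs and drops them,
# e.g. by sending g.E(<ids>).drop() or g.V(<ids>).drop() to the database.
DropFn = Callable[[List[str]], None]


def batches(ids: Sequence, size: int) -> List[list]:
    """Split a list of IDs into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(ids[i:i + size]) for i in range(0, len(ids), size)]


def drop_in_parallel(ids: Sequence, drop_batch: DropFn,
                     batch_size: int = 500, workers: int = 4) -> int:
    """Call drop_batch on every batch using a thread pool.

    Returns the number of batches processed. Leaving the `with` block
    waits for all workers, so the call returns only after every batch
    has finished.
    """
    chunks = batches(ids, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() consumes the results, which re-raises any worker exception.
        list(pool.map(drop_batch, chunks))
    return len(chunks)


def drop_subgraph(edge_ids: Sequence, vertex_ids: Sequence,
                  drop_edges: DropFn, drop_vertices: DropFn,
                  batch_size: int = 500, workers: int = 4) -> Tuple[int, int]:
    """Drop all edge batches first, then all vertex batches."""
    edge_batches = drop_in_parallel(edge_ids, drop_edges, batch_size, workers)
    vertex_batches = drop_in_parallel(vertex_ids, drop_vertices, batch_size, workers)
    return edge_batches, vertex_batches
```

With the gremlinpython client, the callbacks might look something like `lambda batch: g.E(*batch).drop().iterate()` and `lambda batch: g.V(*batch).drop().iterate()`. That wiring is an assumption and is not tested here. Batch size and worker count would need tuning against the instance size, since each drop query occupies one query execution thread.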