Tags: java, gremlin, gremlin-server

Which of the two approaches below is better for validating Gremlin vertices and edges?


I am connected to a Gremlin Server (version 3.4.0) from my Java application using gremlin-driver (version 3.4.0). I am using the following code to connect to the server from Java.

Cluster cluster = Cluster.build("localhost").port(8182).create();
Client client = cluster.connect();
GraphTraversalSource graphTraversalSource = AnonymousTraversalSource.traversal()
    .withRemote(DriverRemoteConnection.using(client, "g"));

// To get the list of vertices
List<Vertex> vertices = graphTraversalSource.V().toList();

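// To add a vertex
GraphTraversal<Vertex, Vertex> newNode = graphTraversalSource.addV("Label 1");

// To add properties to the vertex
newNode.property("key1", "value1");
newNode.property("key2", 1002);

// Nothing is sent to the server until the traversal is iterated
newNode.iterate();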

Now, I have a requirement that each vertex must have certain required properties such as name, uuid, etc. The set of required properties varies from vertex to vertex (based on the vertex label) and can change in the future, so it is dynamic. Because of this, I cannot use a predefined Gremlin schema.

I think I have two options for implementing this.

Approach 1. I can keep the validation logic in my Java application and send the data to Gremlin only if it is valid.

Approach 2. I can implement a traversal strategy, such as the EventStrategy.

The first option is straightforward. For the second option I am facing the following problems.

Issue 1. I cannot find any reference that uses both a remote connection and a strategy with the same GraphTraversalSource.

Issue 2. How do I stop the creation of a vertex if validation fails?

I tried the following to use both a remote connection and a strategy with the same GraphTraversalSource, but it gives me a serialization error.

// Here GremlinMutationListener is a class which implements MutationListener

MutationListener mutationListener = new GremlinMutationListener();
EventStrategy eventStrategy = EventStrategy.build().addListener(mutationListener).create();
GraphTraversalSource graphTraversalSource = AnonymousTraversalSource.traversal()
    .withRemote(DriverRemoteConnection.using(client, "g"))
    .withStrategies(eventStrategy);

The error I get is:

Caused by: java.lang.IllegalArgumentException: Class is not registered: org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.EventStrategy
Note: To register this class use: kryo.register(org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.EventStrategy.class);

Also, in the MutationListener I cannot find a way to stop the execution and return the validation error other than throwing an exception, which might carry a lot of overhead.

public class GremlinMutationListener implements MutationListener {
    private static final Logger LOGGER =
            LoggerFactory.getLogger(GremlinMutationListener.class);

    @Override
    public void vertexAdded(Vertex vertex) {
        LOGGER.info("SS: vertexAdded {}", StringFactory.vertexString(vertex));
        // How can I return the validation error from here besides throwing exception?
        // Is there some other interface which I should implement?
    }

    // ...
}

So the question is: which approach is better here, 1 or 2, considering performance? And if it is 2, how do I resolve the two issues I am facing?


Solution

  • EventStrategy isn't a good way to do validation. You won't get notification of the event until after the change has already occurred in the underlying graph so a validation error would come too late.

    I do think that a TraversalStrategy can be a neat way to implement validation though. I think that you would:

    1. Implement your own ValidationTraversalStrategy that looks for any mutation steps and examines their contents for "bad data", throwing an exception if there is a problem. Since strategy application occurs before traversal iteration, you would stop the traversal before it made modifications to the underlying graph.
    2. Configure "g" in Gremlin Server with the strategy set up server-side, so that all connections to it get the benefit of that strategy automatically.

    The downside here is that not all graphs support custom traversal strategies, so you need to be ok with reduced code portability if you take this approach.

    Another approach which is more portable (and perhaps easier) is to build a Gremlin DSL. In this way you can implement your validation client-side right at the time the traversal is constructed. For example you could add a step like:

    public default GraphTraversal<S, Vertex> person(String personId, String name) {
        if (null == personId || personId.isEmpty()) throw new IllegalArgumentException("The personId must not be null or empty");
        if (null == name || name.isEmpty()) throw new IllegalArgumentException("The name of the person must not be null or empty");
    
        return coalesce(__.V().has(VERTEX_PERSON, KEY_PERSON_ID, personId),
                        __.addV(VERTEX_PERSON).property(KEY_PERSON_ID, personId)).
                property(KEY_NAME, name);
    }
    

    That example is taken from the KillrVideo example repo - you can look there for more inspiration and also consider the related blog posts tied to that repo:

    1. https://www.datastax.com/dev/blog/gremlin-dsls-in-java-with-dse-graph
    2. https://academy.datastax.com/content/gremlin-dsls-python-dse-graph
    3. https://academy.datastax.com/content/gremlin-dsls-net-dse-graph

    Even though these blog posts use different programming languages, the content of each post is applicable to anyone using Gremlin from any language.
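
The DSL step above hardcodes its checks, whereas the question asks for required properties that vary by vertex label and change over time. One way to handle that, whether the check is called from a DSL step or from plain application code as in Approach 1, is to keep the rules in a table keyed by label. The following is only a minimal sketch using the JDK alone: the VertexValidator class and its method names are hypothetical and not part of TinkerPop, and it returns the list of missing keys instead of throwing, so the caller decides how to report the failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a TinkerPop class): per-label required-property rules. */
class VertexValidator {

    // label -> required property keys; an entry is replaced wholesale when its rules change
    private final Map<String, Collection<String>> requiredKeys = new ConcurrentHashMap<>();

    /** Registers or replaces the required keys for a vertex label. */
    void setRequiredKeys(String label, Collection<String> keys) {
        // A TreeSet copy gives a stable, sorted order for error reporting.
        requiredKeys.put(label, Collections.unmodifiableCollection(new TreeSet<>(keys)));
    }

    /**
     * Returns the required keys that are missing or blank, in sorted order.
     * An empty list means the vertex passed validation. Labels without
     * registered rules are treated as having no required keys.
     */
    List<String> validate(String label, Map<String, ?> properties) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys.getOrDefault(label, Collections.<String>emptyList())) {
            Object value = properties.get(key);
            if (value == null || value.toString().trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        VertexValidator validator = new VertexValidator();
        validator.setRequiredKeys("person", Arrays.asList("name", "uuid"));

        Map<String, Object> props = Collections.<String, Object>singletonMap("name", "Ann");
        List<String> missing = validator.validate("person", props);
        if (missing.isEmpty()) {
            // Only at this point would the application build and iterate the addV() traversal.
            System.out.println("valid");
        } else {
            System.out.println("missing: " + missing);
        }
    }
}
```

Because the check runs before any traversal is built, nothing reaches the server for an invalid vertex, and changing the rules for a label is a single setRequiredKeys call rather than a code change.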