Tags: indexing, janusgraph, tinkerpop3, gremlin-server, orientdb3.0

How To Create Vertex Index For Remote GraphTraversalSource


Problem

How do you make Vertex Indexes with traversal().withRemote("conf/remote-graph.properties")?

Questions

  1. Do I have to create a org.apache.tinkerpop.gremlin.structure.Graph?
    Graph graph = TinkerGraph.open(configuration);
    
  2. And if so, where do I find the configurations necessary for JanusGraph?
    1. As opposed to OrientDB 3.2.18 GA Community Edition with Orientdb-Gremlin OrientGraph.java
    Configuration configuration = new BaseConfiguration();
    configuration.setProperty(OrientGraph.CONFIG_URL, orientGraph_configUrl);
    configuration.setProperty(OrientGraph.CONFIG_USER, orientGraph_configUser);
    configuration.setProperty(OrientGraph.CONFIG_PASS, orientGraph_configPass);
    configuration.setProperty(OrientGraph.CONFIG_TRANSACTIONAL, true);
    OrientGraph orientGraph = OrientGraph.open(configuration);
    GraphTraversalSource g = orientGraph.traversal();
    
  3. If I have to, how do you code the simplest version of a working-and-connecting JanusGraphManagement instance?

Findings

  1. g.V().index(); in org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal
    1. Pseudo example here, but I just can't wrap my brain around this yet
      g.V().hasLabel("software").index() //1
      g.V().hasLabel("software").values("name").fold().
        order(Scope.local).
        index().
        unfold().
        order().
          by(__.tail(Scope.local, 1)) //2
      g.V().hasLabel("software").values("name").fold().
        order(Scope.local).
        index().
          with(WithOptions.indexer, WithOptions.list).
        unfold().
        order().
          by(__.tail(Scope.local, 1)) //3
      g.V().hasLabel("person").values("name").fold().
        order(Scope.local).
        index().
          with(WithOptions.indexer, WithOptions.map) //4
      
  2. TinkerGraph.createIndex() in [Gremlin Applications > Gremlin Server > Security > Credentials Graph DSL]
    graph = TinkerGraph.open()
    graph.createIndex("username",Vertex.class)
    
  3. JanusGraph indexing with its JanusGraphManagement.buildIndex(String, Class), but I'm trying to stick closer to Gremlin and keep the code more vendor-agnostic

Code

I can see how to do this if I were only using a blank TinkerGraph in Gremlin, but
I need to connect to an existing JanusGraph [GraphDB] on localhost:8182.

// (1) No index: has('name', ...) scans every vertex
graph = TinkerGraph.open()
g = traversal().withEmbedded(graph)
g.io('data/grateful-dead.xml').read().iterate()
clock(1000) { g.V().has('name','Garcia').iterate() }

// (2) Index on 'name' created before loading: the same lookup uses the index
graph = TinkerGraph.open()
g = traversal().withEmbedded(graph)
graph.createIndex('name', Vertex.class)
g.io('data/grateful-dead.xml').read().iterate()
clock(1000) { g.V().has('name','Garcia').iterate() }

Solution

  • The Gremlin language does not have schema/index functionality. Any indexing you need to create for your graph is going to be graph database specific. For example, with JanusGraph you would use their Management API, and with TinkerGraph you would use the TinkerGraph index API. As neither of these APIs (nor other graph providers' indexing capabilities) is part of the Gremlin language, you cannot configure them remotely unless you submit them as scripts to the server (through Gremlin Console or with a driver like the Java one).

    I'd say that typically, you set up indices as an administrative function. For most graphs, like JanusGraph, you would probably write an index creation script with their APIs, connect to JanusGraph through Gremlin Console and execute it. I suppose you could also work schema management into the code of your application as part of your versioning system and execute it as part of application startup.

    For TinkerGraph, it's a bit different because it is an in-memory graph. You need to define the index on creation of the graph each time because the index is not persisted. For Gremlin Server hosted TinkerGraphs, like your case, you would do that as part of Gremlin Server's initialization script (example here). You would change the onStartUp hook to execute your index creation code:

    onStartUp: { ctx ->
        graph.createIndex("name",Vertex.class)
    },
    

    Here, the graph variable comes from your server yaml file as shown here in the example that comes with the server.
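
For the JanusGraph case, here is a minimal sketch of what "submitting a script to the server" could look like from Java. It only builds the Groovy script text, which uses JanusGraph's Management API (openManagement, makePropertyKey, buildIndex, buildCompositeIndex, commit). The class name JanusIndexScript, the index name byNameComposite, the String data type, and the assumption that the server binds the JanusGraph instance to a variable called graph are all illustrative choices, not something taken from your setup.

```java
class JanusIndexScript {

    /**
     * Builds a Groovy script that creates a composite vertex index through
     * JanusGraph's Management API. The script is meant to run on the server,
     * where the JanusGraph classes are available, not in this JVM.
     *
     * graphVar is the graph binding from the Gremlin Server YAML (assumed "graph").
     * The property key is assumed to hold String values.
     */
    public static String compositeIndexScript(String graphVar, String indexName, String propertyKey) {
        return String.join("\n",
            "mgmt = " + graphVar + ".openManagement()",
            // Reuse the property key if it already exists, otherwise define it.
            "key = mgmt.getPropertyKey('" + propertyKey + "')",
            "if (key == null) key = mgmt.makePropertyKey('" + propertyKey + "').dataType(String.class).make()",
            // Only build the index if one with this name is not already defined.
            "if (mgmt.getGraphIndex('" + indexName + "') == null) "
                + "mgmt.buildIndex('" + indexName + "', Vertex.class).addKey(key).buildCompositeIndex()",
            "mgmt.commit()");
    }

    public static void main(String[] args) {
        String script = compositeIndexScript("graph", "byNameComposite", "name");
        System.out.println(script);
        // To execute it, hand the string to a Gremlin Server client. With the
        // gremlin-driver dependency on the classpath that would be roughly:
        //   Cluster cluster = Cluster.build("localhost").port(8182).create();
        //   cluster.connect().submit(script).all().get();
        //   cluster.close();
    }
}
```

You can also paste the printed script into Gremlin Console connected to the server. Keep in mind that if the property key already exists and vertices already carry values for it, JanusGraph will likely need a reindex before the new index is actually used, so check the index status afterwards.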