I'm running JanusGraph with Cassandra and Elasticsearch (ES) as backends. Below is the script I used to build the composite index.
JanusGraphManagement mgmt = graph.openManagement()
def addPropertyKeyIfNotExists(JanusGraphManagement mgmt, String keyName, Class keyType, org.janusgraph.core.Cardinality cardinalityType) {
    if (!mgmt.containsPropertyKey(keyName)) mgmt.makePropertyKey(keyName).dataType(keyType).cardinality(cardinalityType).make()
}
vertexCompositeIndexName = "vertex_data_composite"
addPropertyKeyIfNotExists(mgmt, "vertex_id", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "tenant", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity_type", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_source", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "last_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)
vertexId = mgmt.getPropertyKey("vertex_id")
tenant = mgmt.getPropertyKey("tenant")
entity = mgmt.getPropertyKey("entity")
entityType = mgmt.getPropertyKey("entity_type")
firstSeenAt = mgmt.getPropertyKey("first_seen_at")
firstSeenSource = mgmt.getPropertyKey("first_seen_source")
lastSeenAt = mgmt.getPropertyKey("last_seen_at")
if (!mgmt.containsGraphIndex(vertexCompositeIndexName)) {
    mgmt.buildIndex(vertexCompositeIndexName, Vertex.class).
        addKey(vertexId).
        addKey(tenant).
        addKey(entity).
        addKey(firstSeenSource).
        addKey(entityType).
        addKey(firstSeenAt).
        addKey(lastSeenAt).
        buildCompositeIndex()
}
println(mgmt.printSchema())
mgmt.commit()
mgmt.close()
graph.close()
graph = JanusGraphFactory.open("/etc/opt/janusgraph/janusgraph.properties")
mgmt = graph.openManagement()
mgmt.awaitGraphIndexStatus(graph, vertexCompositeIndexName).call()
More than an hour has passed and the composite index status is still ENABLED. It never became REGISTERED.
gremlin> mgmt.printSchema()
==>------------------------------------------------------------------------------------------------
Graph Index (Vertex) | Type | Unique | Backing | Key: Status |
---------------------------------------------------------------------------------------------------
vertex_data_composite | Composite | false | internalindex | vertex_id: ENABLED |
| | | | tenant: ENABLED |
| | | | entity: ENABLED |
| | | | first_seen_source: ENABLED |
| | | | entity_type: ENABLED |
| | | | first_seen_at: ENABLED |
| | | | last_seen_at: ENABLED |
---------------------------------------------------------------------------------------------------
I see the following in my logs:
jce-janusgraph | 7953215 [gremlin-server-worker-1] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Some key(s) on index vertex_data_composite do not currently have status(es) [REGISTERED]: entity_type=ENABLED,vertex_id=ENABLED,first_seen_at=ENABLED,first_seen_source=ENABLED,last_seen_at=ENABLED,tenant=ENABLED,entity=ENABLED
jce-janusgraph | 7953216 [gremlin-server-worker-1] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Timed out (PT1M) while waiting for index vertex_data_composite to converge on status(es) [REGISTERED]
The composite index is not used while querying:
gremlin> g.V().has("vertex_id","ddd").profile()
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
JanusGraphStep([],[vertex_id.eq(ddd)]) 49.819 100.00
constructGraphCentricQuery 15.133
constructGraphCentricQuery 0.074
GraphCentricQuery 19.326
\_condition=(vertex_id = ddd)
\_orders=[]
\_isFitted=false
\_isOrdered=true
\_query=[]
scan 17.445
\_query=[]
\_fullscan=true
\_condition=VERTEX
>TOTAL - - 49.819 -
We see the following in the logs:
jce-janusgraph | 339186 [gremlin-server-session-1] WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [(vertex_id = ddd)]. For better performance, use indexes
From the JanusGraph index lifecycle you can see that your index is already ENABLED and can be used. You don't need to wait for the index to be REGISTERED in this particular case.
Normally an index transitions through the following states, in this order:
INSTALLED - The index is installed in the system but not yet registered with all instances in the cluster.
REGISTERED - The index is registered with all instances in the cluster but not (yet) enabled.
ENABLED - The index is enabled and in use.
DISABLED - The index is disabled and no longer in use.
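To check which of these states each key of an index is currently in, without reading the printSchema() table, here is a minimal Gremlin Console sketch. It assumes `graph` is an already opened JanusGraph instance and reuses the index name from the question.

```groovy
import org.janusgraph.core.schema.SchemaStatus

// Assumption: `graph` is an open JanusGraph instance (e.g. from JanusGraphFactory.open(...)).
mgmt = graph.openManagement()
index = mgmt.getGraphIndex("vertex_data_composite")

// A composite index tracks one lifecycle status per key.
index.getFieldKeys().each { key ->
    SchemaStatus status = index.getIndexStatus(key)
    println("${key.name()}: ${status}")
}

// Nothing was changed, so roll the management transaction back.
mgmt.rollback()
```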
Usually, after you create the index, it becomes INSTALLED. Then you wait until all JanusGraph nodes pick up the newly created index and its status changes to REGISTERED. As soon as it is REGISTERED (meaning all JanusGraph nodes know about it), you can enable the index or start the REINDEX process, which automatically enables the index after the reindex finishes.
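The usual path (wait for REGISTERED, reindex, then wait for ENABLED) can be sketched roughly as follows. This follows the pattern shown in the JanusGraph indexing documentation and assumes `graph` is an open JanusGraph instance. It is not needed for the index in the question, which is already ENABLED.

```groovy
import org.janusgraph.core.schema.SchemaAction
import org.janusgraph.core.schema.SchemaStatus
import org.janusgraph.graphdb.database.management.ManagementSystem

indexName = "vertex_data_composite"

// 1. Block until every instance in the cluster has acknowledged the index.
ManagementSystem.awaitGraphIndexStatus(graph, indexName).
    status(SchemaStatus.REGISTERED).
    call()

// 2. Reindex existing data; the index is enabled once the job completes.
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex(indexName), SchemaAction.REINDEX).get()
mgmt.commit()

// 3. Confirm that the index has reached ENABLED.
ManagementSystem.awaitGraphIndexStatus(graph, indexName).
    status(SchemaStatus.ENABLED).
    call()
```

If the graph holds no data yet, `SchemaAction.ENABLE_INDEX` can be used in step 2 instead of `SchemaAction.REINDEX`.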
So why did your index change its status to ENABLED immediately instead of transitioning from state to state? That's because JanusGraph has a special optimization that enables a newly created index immediately if all of its keys were created in the same transaction. In your situation, all the properties were (most likely) created in the same transaction. Thus, your index is now ENABLED. You don't need to do anything else because your index is already in use.
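If you still want the script to block on the index, one option is to wait for ENABLED instead of the default target status REGISTERED. Separately, the JanusGraph documentation notes that a composite index is only used when all of its keys appear in the query's equality conditions, so a lookup on vertex_id alone may still fall back to a full scan even though the seven-key index is ENABLED. The sketch below assumes an open `graph`; every property value in the query is a made-up placeholder.

```groovy
import java.time.temporal.ChronoUnit
import org.janusgraph.core.schema.SchemaStatus
import org.janusgraph.graphdb.database.management.ManagementSystem

// Wait for ENABLED rather than the default target status REGISTERED.
report = ManagementSystem.awaitGraphIndexStatus(graph, "vertex_data_composite").
    status(SchemaStatus.ENABLED).
    timeout(1, ChronoUnit.MINUTES).
    call()
println(report)

// Hypothetical values: constrain every key of the index by equality,
// then inspect the profile output for an index-backed query.
g = graph.traversal()
g.V().has("vertex_id", "ddd").
    has("tenant", "t1").
    has("entity", "e1").
    has("first_seen_source", "s1").
    has("entity_type", "host").
    has("first_seen_at", 1L).
    has("last_seen_at", 2L).
    profile()
```

If lookups by vertex_id alone are the common case, a separate composite index built only on vertex_id would be the usual way to cover them.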
P.S. As a side topic, not directly related to this use case but related to the issue where an index cannot change its state from INSTALLED to REGISTERED, you can check out the following suggestions.