Composite index is not getting registered. It is stuck in ENABLED status forever


I'm running JanusGraph with Cassandra and Elasticsearch (ES) as backends. The following script is what I use to build the composite index.

JanusGraphManagement mgmt = graph.openManagement()
def addPropertyKeyIfNotExists(JanusGraphManagement mgmt, String keyName, Class keyType, org.janusgraph.core.Cardinality cardinalityType) {
    if (!mgmt.containsPropertyKey(keyName)) mgmt.makePropertyKey(keyName).dataType(keyType).cardinality(cardinalityType).make()
}
vertexCompositeIndexName = "vertex_data_composite"

addPropertyKeyIfNotExists(mgmt, "vertex_id", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "tenant", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity_type", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_source", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "last_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)

vertexId = mgmt.getPropertyKey("vertex_id")
tenant = mgmt.getPropertyKey("tenant")
entity = mgmt.getPropertyKey("entity")
entityType = mgmt.getPropertyKey("entity_type")
firstSeenAt = mgmt.getPropertyKey("first_seen_at")
firstSeenSource = mgmt.getPropertyKey("first_seen_source")
lastSeenAt = mgmt.getPropertyKey("last_seen_at")
if (!mgmt.containsGraphIndex(vertexCompositeIndexName)) {
    mgmt.buildIndex(vertexCompositeIndexName, Vertex.class).
        addKey(vertexId).
        addKey(tenant).
        addKey(entity).
        addKey(firstSeenSource).
        addKey(entityType).
        addKey(firstSeenAt).
        addKey(lastSeenAt).
        buildCompositeIndex()
}


println(mgmt.printSchema())
mgmt.commit()
mgmt.close()
graph.close()

graph = JanusGraphFactory.open("/etc/opt/janusgraph/janusgraph.properties")
mgmt = graph.openManagement()
mgmt.awaitGraphIndexStatus(graph, vertexCompositeIndexName).call()

It has been more than an hour and the composite index is still in ENABLED status. It never became REGISTERED.

gremlin> mgmt.printSchema()
==>------------------------------------------------------------------------------------------------
Graph Index (Vertex)           | Type        | Unique    | Backing        | Key:           Status |
---------------------------------------------------------------------------------------------------
vertex_data_composite          | Composite   | false     | internalindex  | vertex_id:    ENABLED |
                               |             |           |                | tenant:       ENABLED |
                               |             |           |                | entity:       ENABLED |
                               |             |           |                | first_seen_source:    ENABLED |
                               |             |           |                | entity_type:    ENABLED |
                               |             |           |                | first_seen_at:    ENABLED |
                               |             |           |                | last_seen_at:    ENABLED |
---------------------------------------------------------------------------------------------------

I see the following in my logs

jce-janusgraph   | 7953215 [gremlin-server-worker-1] INFO  org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher  - Some key(s) on index vertex_data_composite do not currently have status(es) [REGISTERED]: entity_type=ENABLED,vertex_id=ENABLED,first_seen_at=ENABLED,first_seen_source=ENABLED,last_seen_at=ENABLED,tenant=ENABLED,entity=ENABLED
jce-janusgraph   | 7953216 [gremlin-server-worker-1] INFO  org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher  - Timed out (PT1M) while waiting for index vertex_data_composite to converge on status(es) [REGISTERED]

The composite index is not used while querying.

gremlin> g.V().has("vertex_id","ddd").profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
JanusGraphStep([],[vertex_id.eq(ddd)])                                                        49.819   100.00
  constructGraphCentricQuery                                                                  15.133
  constructGraphCentricQuery                                                                   0.074
  GraphCentricQuery                                                                           19.326
    \_condition=(vertex_id = ddd)
    \_orders=[]
    \_isFitted=false
    \_isOrdered=true
    \_query=[]
    scan                                                                                      17.445
    \_query=[]
    \_fullscan=true
    \_condition=VERTEX
                                            >TOTAL                     -           -          49.819        -

We see the following in the logs

jce-janusgraph   | 339186 [gremlin-server-session-1] WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [(vertex_id = ddd)]. For better performance, use indexes

Solution

  • From the JanusGraph index lifecycle you can see that your index is already ENABLED and can be used. You don't need to wait for the index to become REGISTERED in this particular case.
    Normally an index transitions through the following states, in this order:

    INSTALLED - The index is installed in the system but not yet registered with all instances in the cluster.
    REGISTERED - The index is registered with all instances in the cluster but not (yet) enabled.
    ENABLED - The index is enabled and in use.
    DISABLED - The index is disabled and no longer in use.

    Usually, after you create an index it becomes INSTALLED. You then wait until all JanusGraph nodes pick up the newly created index, at which point its status changes to REGISTERED. Once it is REGISTERED (meaning all JanusGraph nodes know about it), you can enable the index or start a REINDEX process, which automatically enables the index after the reindex has finished.
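
    The usual lifecycle described above can be sketched roughly as follows. This is a minimal sketch rather than a definitive recipe: it assumes `graph` is an already opened JanusGraph instance, that the `vertex_id` property key was committed in an earlier transaction, and that the script runs in the Gremlin Console. The index name `by_vertex_id` is hypothetical and chosen only for illustration.

    ```groovy
    import org.apache.tinkerpop.gremlin.structure.Vertex
    import org.janusgraph.core.schema.SchemaAction
    import org.janusgraph.core.schema.SchemaStatus
    import org.janusgraph.graphdb.database.management.ManagementSystem

    // Assumption: `graph` is open and the "vertex_id" key already exists.
    indexName = "by_vertex_id"   // hypothetical index name

    // Roll back any open transaction before changing the schema.
    graph.tx().rollback()

    // Build the index on a pre-existing key; after commit it is INSTALLED.
    mgmt = graph.openManagement()
    mgmt.buildIndex(indexName, Vertex.class).
        addKey(mgmt.getPropertyKey("vertex_id")).
        buildCompositeIndex()
    mgmt.commit()

    // Wait until all JanusGraph instances know about the index (REGISTERED).
    ManagementSystem.awaitGraphIndexStatus(graph, indexName).
        status(SchemaStatus.REGISTERED).
        call()

    // Reindex existing data; the index is enabled once the job finishes.
    mgmt = graph.openManagement()
    mgmt.updateIndex(mgmt.getGraphIndex(indexName), SchemaAction.REINDEX).get()
    mgmt.commit()

    // Confirm that the index has reached ENABLED.
    report = ManagementSystem.awaitGraphIndexStatus(graph, indexName).
        status(SchemaStatus.ENABLED).
        call()
    println(report)
    ```

    The watcher targets REGISTERED when no status is given, so the explicit `status(...)` calls are there mainly to make each waiting step visible.
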
    So why did your index change its status to ENABLED immediately instead of transitioning from state to state? That's because JanusGraph has a special optimization that enables a newly created index immediately if all its keys were created in the same transaction. In your situation, all properties were (most likely) created in the same transaction. Thus, your index is now ENABLED. You don't need to do anything else because your index is already in use.
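
    To check this from the Gremlin Console, one option is to wait for ENABLED rather than REGISTERED, since an index that is already ENABLED will not go back to REGISTERED. The sketch below assumes `graph` is the open JanusGraph instance from the question. It also shows a query that constrains every key of the index: as far as I understand the JanusGraph documentation, a composite index is only used when the query has equality conditions on all of its keys, which would explain why `has("vertex_id","ddd")` alone still falls back to a full scan. All property values other than "ddd" are made up for illustration.

    ```groovy
    import org.janusgraph.core.schema.SchemaStatus
    import org.janusgraph.graphdb.database.management.ManagementSystem

    // Wait for the status the index is actually in (ENABLED), not REGISTERED.
    report = ManagementSystem.awaitGraphIndexStatus(graph, "vertex_data_composite").
        status(SchemaStatus.ENABLED).
        call()
    println(report.getSucceeded())

    // Query with an equality condition on every key of the composite index.
    // The values below are hypothetical placeholders.
    g = graph.traversal()
    g.V().has("vertex_id", "ddd").
        has("tenant", "tenant-a").
        has("entity", "entity-a").
        has("first_seen_source", "source-a").
        has("entity_type", "type-a").
        has("first_seen_at", 1L).
        has("last_seen_at", 2L).
        profile()
    ```

    If lookups by `vertex_id` alone are needed, a separate composite index on just that key is probably the better fit than one index spanning all seven keys.
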

    P.S. As a side topic, not directly related to this use case but related to the situation where an index cannot change its state from INSTALLED to REGISTERED, you can check out the following suggestions.