Search code examples
cassandra-3.0janusgraphtinkerpop3java-17gremlin-server

Why Is JanusGraph Finding Expired KeyColumn Claims?


Problem

What is causing JanusGraph to throw-out so many of this warnings going into full-blown retries? JanusGraph seems to find supposedly dropped vertexes, which is fine if there's a delay;
I just want to add a new vertex after vertex-dropping.

[WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn
Temporary exception during backend operation [CacheMutation]. Attempting backoff retry

Reproduction

Steps

  1. Create-and-Startup a Cassandra [Docker container]
    docker run --name jg-cassandra -d -e CASSANDRA_START_RPC=true -p 9160:9160 -p 9042:9042 -p 7199:7199 -p 7001:7001 -p 7000:7000 cassandra:3.11
    
  2. Load a large (or at least non-trival) dataset into JanusGraph
    1. A sample "medium-sized" dataset is ICIJ Offshore Dataleaks
    2. Medium meaning a larger dataset than someone could hard-code distinct and "random" data
  3. Create-and-Run this Java + Maven Project (code below)
  4. Results
    1. Expected: little-to-no errors-or-warnings with limit-of-1 per instance and then move on
    2. Actually: thrown a wall of warnings that increment k-value up by 100 each time
      Discarded expired claim on KeyColumn [k=0x0x7800000000008080, c=0x0x02] with timestamp 2023-05-09T20:16:08.988543Z
      

Logs

2023-05-10 09:23:19,141 [INFO] [c.d.o.d.i.c.ContactPoints.main] ::   Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-10 09:23:19,226 [INFO] [c.d.o.d.i.c.DefaultMavenCoordinates.main] ::     DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.15.0
2023-05-10 09:23:19,726 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] ::   Using native clock for microsecond precision
2023-05-10 09:23:19,987 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] ::     [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=37423099)=null; please provide the correct local DC, or check your contact points
2023-05-10 09:23:20,231 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] ::   Generated unique-instance-id=c0a8563c404-rmt-lap-win201
2023-05-10 09:23:20,252 [INFO] [c.d.o.d.i.c.ContactPoints.main] ::   Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-10 09:23:20,297 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] ::   Using native clock for microsecond precision
2023-05-10 09:23:20,346 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] ::     [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=1ec16fcd)=null; please provide the correct local DC, or check your contact points
2023-05-10 09:23:20,362 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] ::  Initiated fixed thread pool of size 40
2023-05-10 09:23:20,469 [INFO] [o.j.g.d.StandardJanusGraph.main] ::  Gremlin script evaluation is disabled
2023-05-10 09:23:20,494 [INFO] [o.j.d.l.k.KCVSLog.main] ::   Loaded unidentified ReadMarker start time 2023-05-10T14:23:20.494021Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@f1d0004
2023-05-10 09:23:20,549 [WARN] [o.j.g.t.StandardJanusGraphTx.main] ::    Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-10 09:23:21,835 [INFO] [Main.main] ::    g.V().count().next():  70005
2023-05-10 09:23:21,839 [WARN] [o.j.g.t.StandardJanusGraphTx.main] ::    Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-10 09:26:13,733 [WARN] [o.j.g.t.StandardJanusGraphTx.main] ::    Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-10 09:26:15,427 [INFO] [Main.main] ::    g.V().count().next():  0
2023-05-10 09:33:40,131 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Lock write succeeded but took too long: duration PT0.197258S exceeded limit PT0.1S
2023-05-10 09:38:03,812 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000008080, c=0x0x02] with timestamp 2023-05-09T20:16:07.308532Z
2023-05-10 09:38:03,812 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000008180, c=0x0x02] with timestamp 2023-05-09T20:16:08.146591Z
2023-05-10 09:38:03,812 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000008280, c=0x0x02] with timestamp 2023-05-09T20:16:08.988543Z

...

2023-05-10 09:38:03,876 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000018080, c=0x0x02] with timestamp 2023-05-10T14:28:09.380600Z
2023-05-10 09:38:03,876 [INFO] [o.j.d.u.BackendOperation.main] ::    Temporary exception during backend operation [CacheMutation]. Attempting backoff retry.
org.janusgraph.diskstorage.locking.consistentkey.ExpiredLockException: Expired lock on KeyColumn [k=0x0x7800000000018080, c=0x0x02]: lock timestamp 2023-05-10T14:28:09.380600Z Micros is older than storage.lock.expiry-time=PT5M
    at org.janusgraph.diskstorage.locking.consistentkey.ConsistentKeyLocker.checkSingleLock(ConsistentKeyLocker.java:472) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.locking.consistentkey.ConsistentKeyLocker.checkSingleLock(ConsistentKeyLocker.java:131) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.locking.AbstractLocker.checkLocks(AbstractLocker.java:348) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingTransaction.checkAllLocks(ExpectedValueCheckingTransaction.java:178) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingTransaction.prepareForMutations(ExpectedValueCheckingTransaction.java:157) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.locking.consistentkey.ExpectedValueCheckingStoreManager.mutateMany(ExpectedValueCheckingStoreManager.java:77) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:99) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction$1.call(CacheTransaction.java:96) ~[janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:66) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:52) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.persist(CacheTransaction.java:96) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.flushInternal(CacheTransaction.java:147) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.keycolumnvalue.cache.CacheTransaction.commit(CacheTransaction.java:210) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.diskstorage.BackendTransaction.commitStorage(BackendTransaction.java:147) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.graphdb.database.StandardJanusGraph.commit(StandardJanusGraph.java:917) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.commit(StandardJanusGraphTx.java:1558) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph$GraphTransaction.doCommit(JanusGraphBlueprintsGraph.java:322) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:104) [gremlin-core-3.6.2.jar:3.6.2]
    at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph$GraphTransaction.commit(JanusGraphBlueprintsGraph.java:300) [janusgraph-core-1.0.0-20230504-014643.988c094.jar:?]
    at Test6.main(Test6.java:28) [test-classes/:?]
2023-05-10 09:38:03,925 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000018080, c=0x0x02] with timestamp 2023-05-09T20:09:05.345276Z
2023-05-10 09:38:03,925 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000018080, c=0x0x02] with timestamp 2023-05-09T20:14:34.917340Z
2023-05-10 09:38:03,925 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000018080, c=0x0x02] with timestamp 2023-05-09T20:21:07.864503Z
2023-05-10 09:38:03,933 [WARN] [o.j.d.l.c.ConsistentKeyLocker.main] ::   Discarded expired claim on KeyColumn [k=0x0x7800000000018080, c=0x0x02] with timestamp 2023-05-10T14:28:09.380600Z

...

2023-05-10 09:38:45,394 [INFO] [o.j.d.u.BackendOperation.main] ::    Temporary exception during backend operation [CacheMutation]. Attempting backoff retry. ...

Process finished with exit code -1

Code

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

public class Test6 {
    private static final Logger logger = LogManager.getLogger(Main.class);

    public static void main(String[] args) {
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        GraphTraversalSource g = janusGraph.traversal();
        logger.info("g.V().count().next():\t" + g.V().count().next());
        g.V().drop().iterate();
        logger.info("g.V().count().next():\t" + g.V().count().next());
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
        if (!janusGraphManagement.containsGraphIndex("_id"))
            janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
        janusGraphManagement.commit();
        JanusGraphVertex janusGraphVertex = janusGraph.addVertex();
        janusGraphVertex.property("test", "test1");
        janusGraph.tx().commit();
        janusGraph.close();
    }
}

Resources

  1. Maven 3.8.1
  2. Java 11.0.19 (corretto-11)
  3. JanusGraph 1.0.0-20230504-014643.988c094
  4. Windows 10
  5. Docker
  6. Cassandra:latest [container]
  7. ICIJ Offshore Dataleaks

Solution

  • JanusGraph isn't dropping them.

    Because JanusGraph's server is Cassandra; this instance is not a "JanusGraph Server"
    in-memory or local-distribution as made clear by thee Stephen Mallette

    it's true that Gremlin Server will do auto-commits on transactions but your'e not connected to Gremlin Server. you connected to JanusGraph instance directly so Gremlin Server isn't doing anything to manage transactions. Just a suggestion as you seem to be working in a realtime way and asking questions as you go, stackoverflow really isn't meant for conversational help. i'd highly recommend you bring your conversations to the JanusGraph and/or TinkerPop discord servers for this sort of support. you can find the link to discord on the project home pages. - Stephen M.