Tags: cassandra, graph-databases, titan, janusgraph

JanusGraph query failure due to Cassandra backend tombstone exception


I have raised a GitHub issue regarding this as well. Pasting the same below.

  • JanusGraph version - janusgraph-0.3.1
  • Cassandra - cassandra:3.11.4

When we run JanusGraph with the Cassandra backend, after a period of time JanusGraph starts throwing the errors below and goes into an unusable state.

JanusGraph Logs:

466489 [gremlin-server-exec-6] INFO org.janusgraph.diskstorage.util.BackendOperation - Temporary exception during backend operation [EdgeStoreKeys]. Attempting backoff retry.
org.janusgraph.diskstorage.TemporaryBackendException: Temporary failure in storage backend
    at io.vavr.API$Match$Case0.apply(API.java:3174)
    at io.vavr.API$Match.of(API.java:3137)
    at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.lambda$static$0(CQLKeyColumnValueStore.java:123)
    at io.vavr.control.Try.getOrElseThrow(Try.java:671)
    at org.janusgraph.diskstorage.cql.CQLKeyColumnValueStore.getKeys(CQLKeyColumnValueStore.java:405)

Caused by: com.datastax.driver.core.exceptions.ReadFailureException: Cassandra failure during read query at consistency QUORUM (1 responses were required but only 0 replica responded, 1 failed)
    at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:130)
    at com.datastax.driver.core.exceptions.ReadFailureException.copy(ReadFailureException.java:30)

Cassandra Logs:

WARN [ReadStage-2] 2019-07-19 11:40:02,980 ReadCommand.java:569 - Read 74 live rows and 100001 tombstone cells for query SELECT * FROM janusgraph.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100 (see tombstone_warn_threshold)

ERROR [ReadStage-2] 2019-07-19 11:40:02,980 StorageProxy.java:1896 - Scanned over 100001 tombstones during query 'SELECT * FROM janusgraph.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100' (last scanned row partion key was ((00000000002b9d88), 02)); query aborted

Related Question: Cassandra failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded, 1 failed)

Questions:

1) Are edge updates stored as new items, causing tombstones (since JanusGraph is a fork of Titan)? See "How to increment Number of Visit count in Titan graph database Edge Label?" and https://github.com/JanusGraph/janusgraph/issues/934

2) What is the right approach to resolving this?

Any solutions or pointers would be really helpful.

[Update]

1) Updates to edges did not cause tombstones in JanusGraph.

2) Solutions:

  • As per the answer, reduce gc_grace_seconds to a lower value, chosen according to how often edges/vertices are deleted.
  • Also consider tuning tombstone_failure_threshold in cassandra.yaml as needed.
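To see why the read in the Cassandra logs was aborted, it helps to compare the tombstone count in the WARN line against the two thresholds. The sketch below is only an illustration: the function names are hypothetical, and the threshold values are the defaults shipped in Cassandra 3.11's cassandra.yaml (1000 and 100000), which your cluster may override.

```python
import re

# Assumed values: the Cassandra 3.11 defaults from cassandra.yaml.
# Replace them with your cluster's settings if they were changed.
TOMBSTONE_WARN_THRESHOLD = 1000
TOMBSTONE_FAILURE_THRESHOLD = 100000

# Matches the counters in the "Read N live rows and M tombstone cells" warning.
LOG_PATTERN = re.compile(r"Read (\d+) live rows and (\d+) tombstone cells")


def parse_tombstone_warning(line):
    """Return (live_rows, tombstone_cells) from a log line, or None if absent."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def classify(tombstones,
             warn=TOMBSTONE_WARN_THRESHOLD,
             fail=TOMBSTONE_FAILURE_THRESHOLD):
    """Classify a scanned-tombstone count as 'ok', 'warn', or 'abort'."""
    if tombstones > fail:
        return "abort"  # the read is aborted once the failure threshold is exceeded
    if tombstones > warn:
        return "warn"   # the read succeeds but a warning is logged
    return "ok"


line = ("WARN [ReadStage-2] 2019-07-19 11:40:02,980 ReadCommand.java:569 - "
        "Read 74 live rows and 100001 tombstone cells for query "
        "SELECT * FROM janusgraph.edgestore WHERE column1 >= 02 AND column1 <= 03 LIMIT 100")
live_rows, tombstones = parse_tombstone_warning(line)
print(live_rows, tombstones, classify(tombstones))  # prints: 74 100001 abort
```

With the default settings, the 100001 tombstones scanned in the log are just over the failure threshold, which matches the "query aborted" error. Raising tombstone_failure_threshold only hides the symptom, so it is probably best combined with reducing the tombstones themselves.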


Solution

  • In Cassandra, a tombstone is a marker indicating that a record has been deleted. It is written either when a delete operation is explicitly requested or when the record's Time To Live (TTL) expires. A tombstone persists for at least the time defined by gc_grace_seconds after the delete operation was executed; the default is 10 days.

    Running nodetool repair janusgraph edgestore (the keyspace and table named in the error log) should usually fix the issue. If you still get the error, you may need to decrease the gc_grace_seconds value of the table, as explained here.
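As a rough illustration of what changing gc_grace_seconds means, the sketch below computes when a tombstone becomes eligible for removal and builds the CQL statement that lowers the setting. The helper names and the 86400-second (1 day) value are assumptions for the example, not recommendations. Note that tombstones are only dropped during compaction after the grace period has passed, and that a grace period shorter than your repair interval risks deleted data reappearing.

```python
from datetime import datetime, timedelta

DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default: 10 days


def purge_eligible_at(deleted_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Earliest time at which compaction may drop a tombstone written at deleted_at."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


def alter_gc_grace_cql(keyspace, table, gc_grace_seconds):
    """Build the CQL statement that sets gc_grace_seconds on a table.

    The keyspace and table names are inserted verbatim, so pass trusted values only.
    """
    if not isinstance(gc_grace_seconds, int) or gc_grace_seconds < 0:
        raise ValueError("gc_grace_seconds must be a non-negative integer")
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {gc_grace_seconds};"


# A tombstone written at the time of the error log, under the default setting:
print(purge_eligible_at(datetime(2019, 7, 19, 11, 40, 2)))
# prints: 2019-07-29 11:40:02

# Hypothetical new value of one day; choose yours based on how often you run repair.
print(alter_gc_grace_cql("janusgraph", "edgestore", 86400))
# prints: ALTER TABLE janusgraph.edgestore WITH gc_grace_seconds = 86400;
```

The generated statement can be run from cqlsh against the cluster.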

    For more information regarding tombstones: