cassandra cql

Cassandra BATCH INSERT inserts nothing if the same data was previously BATCH INSERTED and DELETED


I have a strange issue with Cassandra BATCH INSERT. I'm running a local Cassandra instance with a replication factor of 1. The scenario is this:

  1. Use BATCH to INSERT a record: BEGIN BATCH USING TIMESTAMP 16783871583 INSERT INTO test.fruit (lastUpdateTime, name, color) VALUES (1678387158324, 'apple', 'red'); APPLY BATCH; After this, the record is inserted correctly.
  2. DELETE the record: DELETE FROM test.fruit WHERE name = 'apple' AND color = 'red'; After this, the record is deleted correctly.
  3. Run the BATCH statement from step 1 again. This time no record is inserted, no matter how many times I run it or how long I wait.

If I just use INSERT without the BATCH, everything works fine.

Any idea what the reason is?


Solution

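  • The reason is that you explicitly specify the timestamp for your batch, and the value you provide (16783871583) lies far in the past. Read as milliseconds it corresponds to Tue Jul 14 1970 06:11:11 UTC, and since Cassandra by convention treats write timestamps as microseconds since the epoch, it is effectively even earlier, in the first hours of 1 January 1970. When you delete the record, Cassandra writes a tombstone whose timestamp is the time of the delete, 10 March 2023. When you execute the batch again (with the timestamp from 1970), the record is written, but on read it is shadowed by the tombstone, which has the higher timestamp, so no data is returned because the row still counts as deleted. A plain INSERT without USING TIMESTAMP presumably works because it is stamped with the current time, which is newer than the tombstone.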

    Explicitly specifying timestamps is fairly advanced functionality, and you need to fully understand how Cassandra resolves conflicting writes before using it.
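
    To make the shadowing concrete, here is a minimal Python sketch of last-write-wins resolution. It is a toy model, not Cassandra's actual code or any driver API: the helpers from_micros, to_micros and visible are made-up names, the write timestamp is read as microseconds since the epoch (Cassandra's convention), and the tombstone time is an assumption (midnight UTC on 10 March 2023), since only the date is known.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_micros(ts):
    """Read a write timestamp as microseconds since the Unix epoch."""
    return EPOCH + timedelta(microseconds=ts)


def to_micros(moment):
    """Convert a timezone-aware datetime to microseconds since the epoch."""
    return (moment - EPOCH) // timedelta(microseconds=1)


def visible(writes):
    """Toy last-write-wins model (hypothetical helper, not a driver API).

    `writes` holds (timestamp, value) pairs; a value of None stands for a
    tombstone. The entry with the highest timestamp wins, regardless of the
    order in which the writes arrived. Ties are not modeled.
    """
    if not writes:
        return None
    return max(writes, key=lambda w: w[0])[1]


BATCH_TS = 16783871583  # USING TIMESTAMP value from the question
# Assumed tombstone time: only the date (10 March 2023) is known.
DELETE_TS = to_micros(datetime(2023, 3, 10, tzinfo=timezone.utc))

history = [
    (BATCH_TS, "apple/red"),  # step 1: batch insert with the 1970 timestamp
    (DELETE_TS, None),        # step 2: delete writes a tombstone stamped "now"
    (BATCH_TS, "apple/red"),  # step 3: same batch again, same old timestamp
]

print(from_micros(BATCH_TS))  # when the batch claims to have happened
print(visible(history))       # what a read sees after step 3
# A write stamped later than the tombstone becomes visible again.
print(visible(history + [(DELETE_TS + 1, "apple/red")]))
```

    In this model the third write never becomes visible, because its timestamp is lower than the tombstone's. Only a write stamped later than the delete, such as one without USING TIMESTAMP, shows up again.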