This behavior in Cassandra seems undocumented and counterintuitive. I want to know why this is happening and how to prevent such things.
Create a test table.
CREATE TABLE test_table (id text PRIMARY KEY, foo text);
Now create a row in the table with USING TIMESTAMP.
INSERT INTO test_table (id, foo)
VALUES ('first', 'hello')
USING TIMESTAMP 1566912993048082;
The result is
id | foo | writetime(foo)
-------+-------+------------------
first | hello | 1566912993048082
Now let's update the row using the same timestamp.
INSERT INTO test_table (id, foo)
VALUES ('first', 'hello2')
USING TIMESTAMP 1566912993048082;
Everything works fine.
id | foo | writetime(foo)
-------+--------+------------------
first | hello2 | 1566912993048082
Let's update the row again using the same timestamp.
INSERT INTO test_table (id, foo)
VALUES ('first', 'hello1')
USING TIMESTAMP 1566912993048082;
!!! Nothing changed.
id | foo | writetime(foo)
-------+--------+------------------
first | hello2 | 1566912993048082
Update the same row again.
INSERT INTO test_table (id, foo)
VALUES ('first', 'hello3')
USING TIMESTAMP 1566912993048082;
!!! Works again.
id | foo | writetime(foo)
-------+--------+------------------
first | hello3 | 1566912993048082
It seems that, when the timestamp is the same, an update is applied only when old.foo < new.foo.
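Expected result: every INSERT with the same timestamp overwrites the previous value.
Actual result: the write is applied only when the new value is greater than the stored one.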
FYI, I opened a ticket to get an answer to your question. Here is the response, for others who may try this. Again, in a typical situation, one wouldn't do what you're doing.
---- Response ----
As you are aware, DSE/Cassandra resolves conflicts via the write timestamp, where the latest write always wins. In the event of a tie, as detailed in your thought experiment, there are actually two scenarios that need to be handled.
1. Live cell colliding with tombstone: In this situation the tombstone will always win. There is no way to know if that is what the client expects, but the behavior will be consistent.
2. Live cell colliding with another live cell: Similar to the tombstone situation, we have no way of knowing which cell should be returned. In order to provide consistency, when the write timestamps are the same, the larger value wins.
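
To make the rules in the response concrete, here is a minimal Python sketch that replays the inserts from the question. It is only a model of the described behavior, not Cassandra's actual code: the names `Cell` and `reconcile` are made up for illustration, a tombstone is modeled as a `None` value, and comparing the values as raw bytes is an assumption about what "larger" means for a text column.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """Hypothetical model of a stored cell; value=None stands for a tombstone."""
    value: Optional[bytes]
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the surviving cell according to the rules quoted in the response."""
    # Different timestamps: the latest write wins.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Same timestamp, at least one tombstone: the tombstone wins.
    if a.value is None or b.value is None:
        return a if a.value is None else b
    # Same timestamp, two live cells: the larger value wins
    # (byte-wise comparison is an assumption of this sketch).
    return a if a.value >= b.value else b


TS = 1566912993048082

# Replay the question: 'hello' is stored first, then three more writes
# arrive with the same timestamp.
stored = Cell(b"hello", TS)
history = []
for new_value in (b"hello2", b"hello1", b"hello3"):
    stored = reconcile(stored, Cell(new_value, TS))
    history.append(stored.value)

print(history)
```

Under these assumptions the write of 'hello1' is dropped because it compares lower than the stored 'hello2', which matches what the question observed. Giving each write a strictly larger timestamp avoids the tie altogether.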