Using default TTL columns but high number of tombstones in Cassandra


I am using Cassandra 3.0.12.

I have a Cassandra column family (CQL table) with the following schema:

CREATE TABLE win30 (
    cust_id text,
    tid timeuuid,
    info text,
    PRIMARY KEY (cust_id , tid )
) WITH CLUSTERING ORDER BY (tid DESC) 
and compaction = {'class': 'DateTieredCompactionStrategy', 'max_sstable_age_days': 31 };

alter table win30 with default_time_to_live = '2592000';

I have set the default_time_to_live property for the entire table, but when I query the table,

select * from win30 order by tid desc limit 9999

Cassandra logs this warning:

Read xx live rows and xxxx tombstone for query  xxxxxx (see tombstone_warn_threshold).

According to the documentation page "How is data deleted?":

Cassandra allows you to set a default_time_to_live property for an entire table. Columns and rows marked with regular TTLs are processed as described above; but when a record exceeds the table-level TTL, Cassandra deletes it immediately, without tombstoning or compaction.

"but when a record exceeds the table-level TTL, Cassandra deletes it immediately, without tombstoning or compaction."

Why does Cassandra still warn about tombstones even though I have set default_time_to_live?

I insert data with CQL like the following, without an explicit TTL:

insert into win30 (cust_id, tid, info ) values ('123', now(), 'sometext'); 

There is a similar question, but it does not use default_time_to_live.

It also seems that I could set unchecked_tombstone_compaction to true?

One more question: I select data in the same order as the CLUSTERING ORDER, so why does Cassandra hit so many tombstones?


Solution

  • Why does Cassandra still warn about tombstones even though I have set default_time_to_live?

    TTL in Cassandra works like this: once a record expires, it is marked as a tombstone, exactly as if it had been deleted. Instead of running a manual purge job as you would in the RDBMS world, you let Cassandra clean up old records based on their TTL. Expiry still goes through the same process as a DELETE, hence the tombstones. Since your TTL is 2592000 seconds (30 days), anything older than 30 days in the table has expired (it is marked as a tombstone, i.e. deleted).

    The warning appears because your SELECT statement looks for live (non-deleted) records, and the message reports how many tombstoned (expired or deleted) records were encountered in the process. So while trying to serve up to 9999 live records, the read hit X tombstones along the way.

    Since the TTL is set at the table level, any record inserted into this table gets a default TTL of 30 days.
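
    To confirm that the table-level default is applied to rows written without USING TTL, you can read back the remaining TTL of a regular column. This is a minimal sketch, assuming the partition '123' from the question's INSERT exists. TTL() and WRITETIME() are standard CQL functions that apply to non-primary-key columns, and 2592000 is simply 30 * 24 * 60 * 60 seconds.

    ```cql
    -- Remaining TTL (seconds) and write timestamp (microseconds) of the info column.
    -- cust_id = '123' is taken from the question's INSERT; adjust to a real partition.
    SELECT tid, TTL(info), WRITETIME(info)
    FROM win30
    WHERE cust_id = '123'
    LIMIT 10;
    ```

    A freshly inserted row should report a TTL just under 2592000, counting down as the row ages.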

    Here is the documentation reference, in case you want to read more.

    After the number of seconds since the column's creation exceeds the TTL value, TTL data is considered expired and is not included in results. Expired data is marked with a tombstone on the next read on the read path, but it remains for a maximum of gc_grace_seconds.

    Above reference is from this link

  • And it seems that I could set the unchecked_tombstone_compaction to true?

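    That setting is unrelated to the warning you are getting. You could consider reducing gc_grace_seconds (default 10 days) to get rid of tombstones sooner. But there is a reason the default is 10 days: the grace period gives repairs time to propagate tombstones to every replica, and if it is shorter than your repair interval, deleted data can reappear.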
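
    If you do decide to shorten the grace period, it is a per-table property. The statement below is only a sketch: 432000 seconds (5 days) is an arbitrary illustrative value, not a recommendation, and it assumes repairs on this table complete more often than that.

    ```cql
    -- Default is 864000 seconds (10 days).
    -- 432000 (5 days) is a hypothetical value chosen for illustration only.
    ALTER TABLE win30 WITH gc_grace_seconds = 432000;
    ```

    Tombstones only become eligible for removal after this period, and they are physically dropped when a later compaction processes the SSTables that contain them.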

    Note that DateTieredCompactionStrategy is deprecated. Once you upgrade to Apache Cassandra 3.11 or DSE 5.1.2, you can use TimeWindowCompactionStrategy, which does a better job of handling tombstones.
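
    Switching strategies is a single ALTER TABLE. This is a sketch, assuming your cluster runs a version that includes TimeWindowCompactionStrategy. The 1-day window is an assumption chosen so that the 30-day TTL spans roughly 30 windows; tune it for your own write pattern.

    ```cql
    -- compaction_window_unit and compaction_window_size are TWCS options;
    -- the values here (1-day windows) are illustrative assumptions.
    ALTER TABLE win30 WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': 1
    };
    ```

    With time-windowed SSTables and a uniform TTL, an SSTable whose data has all expired can be dropped as a whole, which is why this strategy tends to suit TTL-only tables like this one.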