cassandra datastax cql

Cassandra simple primary key queries


We would like to create a Cassandra table with a simple primary key consisting of a single UUID column. The table will look like:
CREATE TABLE simple_table( id UUID PRIMARY KEY, col1 text, col2 text, col3 UUID );

This table will potentially store a few billion rows, and the rows should expire after some time (a few months) using the TTL feature. I have a few questions regarding the efficiency of this table:

  1. How efficient is a query against this table using the primary key? That is, how does Cassandra find a specific row after resolving which partition it resides in?
  2. Considering that the rows will expire and create many tombstones, how will this affect reads and writes to this table? Say we expire the data after 180 days; if I am not mistaken, the ratio of tombstones would be 10/180 ≈ 0.056 (where 10 is gc_grace_seconds expressed in days).
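
For reference, the lookup in question 1 restricts on the full primary key, so it targets a single partition. A minimal sketch, assuming a made-up UUID value; TTL() is a standard CQL function that reports the remaining seconds before a column value expires:

```cql
-- Hypothetical id value, for illustration only.
SELECT id, col1, col2, col3, TTL(col1) AS col1_ttl_seconds
FROM simple_table
WHERE id = 123e4567-e89b-12d3-a456-426614174000;
```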

Solution

  • After reading the blog post (and its comments) that @Alex referred me to, I concluded that tombstones are created for rows that expire due to the table's default_time_to_live. Those tombstones are cleaned up only after gc_grace_seconds has elapsed. See this Stack Overflow question.

    Regarding my first question, this DataStax page describes it pretty well.
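
The two table options mentioned above could be set roughly as follows. This is only a sketch, assuming the 180-day expiry from the question and a 10-day grace period; both values are given in seconds:

```cql
-- 180 days * 86400 s/day = 15552000 s (expiry assumed from the question).
-- 864000 s = 10 days, which is Cassandra's default gc_grace_seconds.
ALTER TABLE simple_table
WITH default_time_to_live = 15552000
AND gc_grace_seconds = 864000;
```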