cassandra, cassandra-2.0, cassandra-3.0

Why do Tombstones affect read performance but not updates?


From the articles I’ve read, tombstones affect read performance in Cassandra. I’m reading about how data is updated in Cassandra, and it looks like data is written with a timestamp, without modifying or reading the current data.

So when a read is performed before compaction is done, filtering needs to be done to pick the latest value, right? If that’s the case, aren’t tombstones the same thing? Why do they affect performance negatively when updates to a row do not?


Solution

  • In Cassandra, an update is a mutation, just like an insert or a delete. Except for LWTs (lightweight transactions) and some list operations, every mutation is simply appended to the commit log and memtable, without reading the data on disk. That is why writes are very fast: no checks are performed.

    A read, in contrast, has to collect all versions of the data from the memtable and from disk, and then build the current version of the data based on the timestamps. Tombstones have to be kept in memory while that happens, because data read from disk afterwards may carry an older timestamp, and the tombstone is what lets us detect that this data has been deleted.
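
To make the difference concrete, here is a minimal toy model in Python. It is only a sketch of the idea and not Cassandra's actual storage engine: `ToyTable`, `Cell`, `flush` and `scan` are hypothetical names invented for illustration, and the rule that a delete wins a timestamp tie is a simplifying choice of this sketch. Writes and deletes only append, while reads have to merge every version they find, and a scan still has to process each tombstone even though it returns no data for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str]  # None marks a tombstone (delete marker)

    @property
    def is_tombstone(self) -> bool:
        return self.value is None

    def sort_key(self):
        # Newest timestamp wins; in this sketch a tombstone wins a tie.
        return (self.timestamp, self.is_tombstone)


class ToyTable:
    """Toy model: one in-memory memtable plus immutable flushed segments."""

    def __init__(self):
        self.memtable = {}   # key -> list of Cell
        self.sstables = []   # each flushed segment is a dict: key -> list of Cell

    def write(self, key, value, timestamp):
        # Append only: nothing already stored is read or modified.
        self.memtable.setdefault(key, []).append(Cell(timestamp, value))

    def delete(self, key, timestamp):
        # A delete is just another append, of a tombstone.
        self.write(key, None, timestamp)

    def flush(self):
        # Freeze the current memtable as a new immutable segment.
        self.sstables.append(self.memtable)
        self.memtable = {}

    def read(self, key):
        # Gather every version from the memtable and all segments, then merge.
        cells = list(self.memtable.get(key, []))
        for segment in self.sstables:
            cells.extend(segment.get(key, []))
        if not cells:
            return None
        newest = max(cells, key=Cell.sort_key)
        return None if newest.is_tombstone else newest.value

    def scan(self):
        # Merge all keys; tombstones must be held during the merge so that
        # older values found in other segments can be recognised as deleted.
        merged = {}
        tombstones_seen = 0
        for segment in [self.memtable, *self.sstables]:
            for key, cells in segment.items():
                for cell in cells:
                    if cell.is_tombstone:
                        tombstones_seen += 1
                    current = merged.get(key)
                    if current is None or cell.sort_key() > current.sort_key():
                        merged[key] = cell
        live = {k: c.value for k, c in merged.items() if not c.is_tombstone}
        return live, tombstones_seen
```

In this model `write` and `delete` cost the same no matter how much data already exists, while the work done by `scan` grows with the number of tombstones it encounters, even when the result it returns is empty.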