
How does Cassandra know that it has completed QUORUM?


I have always used Cassandra in Spark applications, but I never wondered how it works internally. While reading the Cassandra documentation, I ran into a small doubt (which may be a beginner's doubt). I read in a book (Cassandra: The Definitive Guide) and in the official Cassandra documentation that the quorum formula is (RF / 2) + 1, with the division rounded down.

So, in theory, if I have a cluster with 6 nodes and a replication factor of 3, I would only need a response from 2 nodes.

And here come the small doubts:

1 - What would this response be? (The query return with the data?)

2 - If there was no data with the filters used in the query, is the empty return considered a response?

3 - And last but not least, if the empty return is considered a response, if these two nodes that complete the QUORUM don't have the replica data yet, my application that did the SELECT will understand that this data doesn't exist in the database, right?


Solution

  • 1 - What would this response be? (The query return with the data?)

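    The cluster size (6 nodes) does not matter here; only the 3 replicas that own the partition do. With CL=QUORUM and RF=3, the coordinator node waits for 2 of those 3 replicas to respond to the query with their results, merges them, and then sends the response to the client.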

    2 - If there was no data with the filters used in the query, is the empty return considered a response?

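    Yes, an empty result is a valid response and counts toward the QUORUM. When the responding replicas disagree, Cassandra resolves the conflict with last-write-wins, based on write timestamps. So if one replica returns the data and the other returns nothing, the client still gets the data.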

    3 - And last but not least, if the empty return is considered a response, if these two nodes that complete the QUORUM don't have the replica data yet, my application that did the SELECT will understand that this data doesn't exist in the database, right?

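    Yes: if both replicas that answer the read do not have the data yet, the application sees an empty result. Apache Cassandra is eventually consistent with tunable consistency, meaning that the client chooses the CL for each request. If the write CL and the read CL overlap (Write CL + Read CL > RF), you get strong consistency and will always retrieve the latest data. For example, with RF=3, a QUORUM write is acknowledged by 2 replicas, so any 2 replicas that answer a QUORUM read include at least one that has the data. I recommend watching this video: https://www.youtube.com/watch?v=Gx-pmH-b5mI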
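
The three answers can be illustrated with a small Python sketch. This is a toy model, not Cassandra's actual implementation: the names `quorum`, `merge`, and `possible_reads`, as well as the sample row and its timestamp, are hypothetical. Each replica's response is modelled as either `None` (no data for the key) or a `(value, write_timestamp)` pair, and the sketch lists what a QUORUM read could return for every possible pair of responding replicas.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Quorum size for a replication factor: floor(RF / 2) + 1."""
    return rf // 2 + 1


def merge(responses):
    """Last-write-wins merge of replica responses.

    Each response is None (replica has no data) or a (value, timestamp)
    tuple. Empty responses are valid but carry no data, so any response
    that has data wins over them; among those, the newest timestamp wins.
    """
    found = [r for r in responses if r is not None]
    return max(found, key=lambda r: r[1]) if found else None


def possible_reads(replicas, read_cl):
    """Merged result for every possible set of `read_cl` responding replicas."""
    return [merge(subset) for subset in combinations(replicas, read_cl)]


RF = 3
row = ("alice", 1000)  # hypothetical (value, write timestamp)

# Worst case after a write acknowledged at QUORUM: 2 of 3 replicas hold the row.
after_quorum_write = [row, row, None]
# Worst case after a write acknowledged at ONE: only 1 replica holds the row so far.
after_one_write = [row, None, None]

print(quorum(RF))  # 2
# QUORUM write + QUORUM read (2 + 2 > 3): every read sees the row.
print(possible_reads(after_quorum_write, quorum(RF)))
# ONE write + QUORUM read (1 + 2 = 3, not > 3): one combination returns None.
print(possible_reads(after_one_write, quorum(RF)))
```

In this model, the second list contains the row in every position, while the third contains one `None`: the case where the two responding replicas are exactly the ones that have not received the write yet.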