cassandra cockroachdb scylla paxos

Performance of LWT in scylla/cassandra


I want to understand why people say that the throughput of an entire Cassandra/ScyllaDB/CockroachDB cluster depends on LWT/linearizable transactions.

LWT limits the throughput of a hot key (an index or row that receives many queries). I understand that we can't increase the throughput of a single key under LWT in Cassandra/ScyllaDB, or in any database, by scaling horizontally. My question is: why can't we increase the throughput of the whole cluster by adding nodes?

To add to the question: if Paxos/Raft blocks on one index and limits the throughput of that hot index/key, why should other keys be affected (that is, why should the throughput of the whole cluster be affected)?

If the throughput of the whole cluster is affected by LWT/linearizable transactions, then what is the point of a horizontally scalable database? Does the same thing happen in CockroachDB?

Is there a benchmark of single-key throughput under LWT/linearizable transactions for ScyllaDB/Cassandra/CockroachDB?

https://www.youtube.com/watch?v=L9cO9OYhOtU

Why does this video show constant throughput for LWT, and why is the number of nodes not treated as a variable in the benchmark?

Expectation: The LWT/linearizable-transaction throughput of a ScyllaDB/Cassandra/CockroachDB cluster should scale as nodes are added to the cluster.


Solution

  • Currently in ScyllaDB (I don't know about the others), LWT is serialized at the partition level. This means that requests that go to different partitions are handled in parallel, and scaling the cluster scales the number of LWT requests you can do to different partitions exactly as you would expect.

    Currently, ScyllaDB needlessly serializes LWTs that go to different rows in the same partition. This is considered a bug and is planned to be fixed (see #6399). But in any case, as long as there is no hot partition whose rows receive a majority of the updates, this won't cause a scalability problem.
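To make the partition-level argument concrete, here is a toy back-of-the-envelope model in Python. It is only a sketch, not a benchmark and not how ScyllaDB is implemented: the 5 ms LWT latency, the 10,000 ops/s per-node capacity, the hash-based placement in `node_for`, and the function `max_lwt_throughput` are all assumptions made up for illustration, and replication and Paxos round trips are ignored. It assumes only what the answer states: LWTs to one partition run one at a time, while different partitions proceed in parallel.

```python
import hashlib
from collections import defaultdict


def node_for(partition_key: str, num_nodes: int) -> int:
    """Hypothetical placement: hash the partition key onto one node."""
    digest = hashlib.sha256(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def max_lwt_throughput(shares, num_nodes,
                       lwt_latency_s=0.005,         # assumed time per LWT
                       node_capacity_ops=10_000.0):  # assumed per-node limit
    """Largest total ops/s the toy cluster can sustain.

    `shares` maps partition key -> fraction of all requests it receives.
    """
    # LWTs on one partition are serialized: at most 1/latency ops/s each.
    per_partition_cap = 1.0 / lwt_latency_s
    partition_bound = min(per_partition_cap / s
                          for s in shares.values() if s > 0)

    # Each node can only serve so many requests across all its partitions.
    node_share = defaultdict(float)
    for key, s in shares.items():
        node_share[node_for(key, num_nodes)] += s
    node_bound = min(node_capacity_ops / s
                     for s in node_share.values() if s > 0)

    return min(partition_bound, node_bound)


# Workload A: requests spread evenly over 10,000 partitions.
uniform = {f"pk-{i}": 1 / 10_000 for i in range(10_000)}
# Workload B: 90% of requests hit a single hot partition.
hot = {"pk-hot": 0.9, **{f"pk-{i}": 0.1 / 999 for i in range(999)}}

for n in (3, 6, 12):
    print(n, round(max_lwt_throughput(uniform, n)),
          round(max_lwt_throughput(hot, n)))
```

Under these assumptions, the evenly spread workload is bounded by node capacity, so its ceiling rises as nodes are added. The hot-partition workload is bounded by the single serialized partition, so adding nodes does not raise its ceiling.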