
Which Cassandra partitioner is better: Random or Murmur3 (in terms of throughput) and what is the difference between them?


What difference could the choice of partitioner make to my Cassandra throughput and latency? I have looked at all three partitioners, and I noticed that the ByteOrdered partitioner has overhead, so I do not use it. Now I am torn between the Random and Murmur3 partitioners.


Solution

  • The main difference between the two is how each generates token hash values. The Random partitioner uses the JDK's built-in MD5 hash, because it was both convenient for the developers and standard across all JDKs. But Cassandra does not need a cryptographic hash, so that function takes much longer than it needs to.

    With the Murmur3 partitioner, token hashing does only what Cassandra needs it to do: generate tokens that distribute data evenly across the nodes. This makes token hashing 3 to 5 times faster, which ultimately translates into the overall 10% gain that Carlo mentioned above.

    Note also that DataStax warns that the partitioners are not compatible: once you start with one partitioner, you cannot (easily) convert to the other. Therefore, I would pick the newer, slightly faster Murmur3 partitioner.
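
To make the token hashing concrete, here is a minimal Python sketch of a RandomPartitioner-style token: the MD5 digest of the partition key, read as a signed big-endian integer, then made non-negative. It is an illustration, not Cassandra's actual code. The function names, the sample keys, and the equal-slice `owner` mapping are hypothetical (a real cluster assigns token ranges per node or vnode). The sketch does not cover Murmur3, which is not in the Python standard library.

```python
import hashlib


def md5_token(partition_key: bytes) -> int:
    """RandomPartitioner-style token: abs() of the MD5 digest read as a
    signed 128-bit big-endian integer, so the result lies in [0, 2**127]."""
    digest = hashlib.md5(partition_key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def owner(token: int, node_count: int) -> int:
    """Hypothetical ownership rule: cut the token range into equal slices,
    one per node. Real clusters use configured token ranges instead."""
    range_size = 2**127
    # Clamp so the single edge value 2**127 maps to the last node.
    return min(token * node_count // range_size, node_count - 1)


def distribution(keys, node_count: int):
    """Count how many keys land on each (hypothetical) node."""
    counts = [0] * node_count
    for key in keys:
        counts[owner(md5_token(key), node_count)] += 1
    return counts


if __name__ == "__main__":
    sample_keys = [f"user-{i}".encode() for i in range(10_000)]
    # Prints one count per node; the counts should be roughly equal.
    print(distribution(sample_keys, node_count=4))
```

The point of the sketch is that the partitioner only needs the hash to spread keys evenly. Nothing here depends on MD5's cryptographic properties, which is why a cheaper non-cryptographic hash such as Murmur3 can do the same job.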