Tags: cassandra, cql3, md5sum, cassandra-3.0

Will Cassandra avoid calculating the row MD5 if the value already is an MD5?


Various documents about Cassandra clearly state that it converts row keys to an MD5 hash before saving them in the database.

If my row keys already are MD5 sums, is there a way to let Cassandra know and thus avoid having it calculate the MD5 of that MD5?

P.S. The table I am talking about stores files, and the keys are the files' MD5 sums.


Solution

  • What Cassandra actually does is hash the partition key with whatever partitioner the cluster is configured to use. The original default, RandomPartitioner, is based on MD5, but modern versions of Cassandra default to Murmur3Partitioner (not quite standard Murmur3, but essentially that).

    In either case, Cassandra still hashes the partition key: there is no way to tell it that the key is already an MD5.

    If you really want to avoid the hashing, you can look at alternative partitioners (such as ByteOrderedPartitioner or OrderPreservingPartitioner), or write your own class that implements IPartitioner. Note, though, that the partitioner is a cluster-wide setting: whichever one you choose is used for all tables and keyspaces in the cluster.
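To make the point about re-hashing concrete, here is a minimal Python sketch of how an MD5-based partitioner could derive a token from a key that is itself an MD5 digest. It is an illustration, not Cassandra's code: the function name `md5_partitioner_token` is hypothetical, and the token rule (the 16 digest bytes read as a signed big-endian integer, then made non-negative) is an assumption about how RandomPartitioner behaves.

```python
import hashlib


def md5_partitioner_token(partition_key: bytes) -> int:
    """Hypothetical stand-in for an MD5-based partitioner's token function."""
    # The partitioner hashes whatever bytes it is given; it cannot know
    # that the key is already a digest.
    digest = hashlib.md5(partition_key).digest()
    # Assumption: the 16 bytes are read as a signed big-endian integer
    # and the absolute value is used as the token.
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# The application's row key: the MD5 sum of a file's contents.
file_contents = b"example file contents"
row_key = hashlib.md5(file_contents).digest()

# The token is derived from a second MD5 pass over the 16-byte row key.
token = md5_partitioner_token(row_key)
print(row_key.hex(), token)
```

In this sketch the second hash runs over only 16 bytes, regardless of how large the original file was.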