Both MyRocks (MySQL) and Cassandra use an LSM architecture to store their data. I populated around 5 million rows in MySQL with MyRocks as the storage engine, and the same rows in Cassandra. In Cassandra the data takes only 1.7 GB of disk space, while in MySQL with MyRocks it takes 19 GB.
Am I missing something? Both use the same LSM mechanism, so why do they differ so much in data size?
Update:
I guess it has something to do with the text column. My table structure is (bigint, bigint, varchar, text).
But if I remove the text column then:
Any idea what causes this behaviour?
The reason for the above behaviour is that rocksdb_block_size was set to 4 KB. With smaller data blocks, the compressor sees less data in each block, so it finds fewer repeated patterns and compresses less effectively. Setting it to 16 KB solved the issue. I now get a data size similar to Cassandra's.
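To illustrate the block-size effect, here is a rough sketch in Python that compresses the same data in independent 4 KB and 16 KB blocks and compares the totals. It is not how MyRocks works internally: zlib is used only as a stand-in for whatever compressor the server is configured with, and the rows, the vocabulary, and the helper names (make_rows, compressed_size) are made up for the example.

```python
import random
import zlib


def make_rows(n_rows, seed=0):
    """Build synthetic rows shaped like (bigint, bigint, varchar, text).

    The data is invented for illustration; it is not the table from the question.
    """
    rng = random.Random(seed)
    vocab = ["storage", "engine", "block", "compress", "cassandra", "mysql",
             "rocksdb", "table", "column", "index", "write", "read",
             "level", "memtable", "sstable", "flush", "compaction", "disk"]
    rows = []
    for i in range(n_rows):
        text = " ".join(rng.choice(vocab) for _ in range(40))
        rows.append(f"{i},{i * 7},user{i % 100},{text}\n")
    return "".join(rows).encode("utf-8")


def compressed_size(data, block_size):
    """Compress each block independently and return the total compressed size."""
    total = 0
    for start in range(0, len(data), block_size):
        total += len(zlib.compress(data[start:start + block_size]))
    return total


data = make_rows(5000)
size_4k = compressed_size(data, 4 * 1024)
size_16k = compressed_size(data, 16 * 1024)

print("raw bytes:        ", len(data))
print("4 KB blocks total:", size_4k)
print("16 KB blocks total:", size_16k)
```

Each block is compressed on its own, so every block pays the compressor's fixed overhead again and starts with no history to match against. Larger blocks spread that cost over more data, which is why the 16 KB total comes out smaller. The exact ratio depends heavily on the data and the compressor, so the numbers from this sketch should not be read as a prediction for a real table.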