cassandra, nosql, cassandra-3.0

How do I find the partition key of Cassandra partitions with size greater than 100MB?


I want to get a list of partitions larger than 100 MB for analysis. How can I achieve this?


Solution

  • Cassandra logs a WARN message with the details of any partition it compacts whose size exceeds compaction_large_partition_warning_threshold (named compaction_large_partition_warning_threshold_mb in versions before 4.1). The default in cassandra.yaml is 100 MiB:

    # Log a warning when compacting partitions larger than this value
    compaction_large_partition_warning_threshold: 100MiB
    

    You can parse system.log on each Cassandra node and look for entries containing the string "Writing large partition". An entry looks something like:

    WARN  [CompactionExecutor:#] BigTableWriter.java:258 maybeLogLargePartitionWarning \
      Writing large partition ks_name/tbl_name:pk (###.###MiB) to sstable /path/to/.../...-big-Data.db
    

    It should be easy enough to write a shell script that would extract the table name and partition key from the logs. Cheers!
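
As a rough illustration of that extraction step, here is a minimal sketch in Python rather than shell. It assumes the log entries follow the format quoted above, with the size written as a number followed by bytes, KiB, MiB, or GiB; the exact wording can differ between Cassandra versions, so check the pattern against your own system.log first. The function name find_large_partitions and the min_mib parameter are made up for this example.

```python
import re
import sys

# Matches the warning format quoted above. The list of size units is an
# assumption; adjust it if your Cassandra version prints sizes differently.
LARGE_PARTITION_RE = re.compile(
    r"Writing large partition (?P<keyspace>[^/\s]+)/(?P<table>[^:\s]+):(?P<key>.+?) "
    r"\((?P<size>[\d.]+)\s*(?P<unit>bytes|KiB|MiB|GiB)\)"
)

# Conversion factors from each unit to MiB.
UNIT_TO_MIB = {
    "bytes": 1 / (1024 * 1024),
    "KiB": 1 / 1024,
    "MiB": 1.0,
    "GiB": 1024.0,
}


def find_large_partitions(lines, min_mib=100.0):
    """Return {(keyspace, table, partition_key): largest size in MiB seen}.

    Only partitions strictly larger than min_mib are kept. A partition that
    is logged several times is reported once, with its largest size.
    """
    found = {}
    for line in lines:
        match = LARGE_PARTITION_RE.search(line)
        if match is None:
            continue
        size_mib = float(match["size"]) * UNIT_TO_MIB[match["unit"]]
        if size_mib <= min_mib:
            continue
        ident = (match["keyspace"], match["table"], match["key"])
        found[ident] = max(size_mib, found.get(ident, 0.0))
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python large_partitions.py /path/to/system.log
    with open(sys.argv[1], errors="replace") as log:
        results = find_large_partitions(log)
    # Print the largest partitions first.
    for (ks, tbl, key), size in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{size:10.1f} MiB  {ks}.{tbl}  {key}")
```

Run it against the system.log of every node, since each node only logs the partitions it compacts itself. Partitions that have not been compacted since the log started will not show up.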