logging cassandra datastax datastax-java-driver spring-data-cassandra

Cassandra query logging: data size


The DataStax QueryLogger (i.e. Cassandra query logging through Spring configuration) outputs good information about query timing.

DEBUG c.d.driver.core.QueryLogger.NORMAL - [cluster1] [localhost/127.0.0.1:9042] Query completed normally, took 100 ms: SELECT * FROM my_table;

In addition to the speed of the query, I'm also interested in the size of the payload. Is there a way to log the amount of data retrieved? Something like this?

Query completed normally, took 100 ms: SELECT * FROM my_table returned 5MB;


Solution

  • It's complicated. First, you need to define what you mean by "size of the payload".

    If you want the size of the encoded values in a statement, in other words the size of the request once serialized to the wire, then you could have a look at the Java driver's Statement.computeSizeInBytes method. Beware, though, that this method belongs to driver 4.x, whereas you seem to be using driver 3.x.

    If you want the total size of the mutation once it gets written to disk, it's trickier. Cassandra does have an internal utility, org.apache.cassandra.db.IMutation.dataSize(), but it's hard to reproduce this algorithm outside of the coordinator node. DataStax Bulk Loader has a utility that tries its best to do that: DataSizes. Feel free to reuse that logic in your own code.

    And lastly, you would have to modify the query logging code to append the data size to the logged message. The driver doesn't do that by default.
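
    As a rough illustration of that last step, here is a minimal sketch in plain Java; it is not driver code. It assumes you can get hold of the raw serialized column values as ByteBuffers (in driver 3.x, for example, via Row.getBytesUnsafe), and it counts only the value bytes, ignoring protocol framing and metadata, so the figure is an approximation. The class and method names (PayloadSize, totalBytes, humanReadable, appendSize) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the DataStax driver.
final class PayloadSize {

    private PayloadSize() {}

    // Sums the readable bytes of each raw column value.
    // A null entry stands for a CQL NULL and counts as zero bytes.
    static long totalBytes(List<ByteBuffer> rawValues) {
        long total = 0;
        for (ByteBuffer value : rawValues) {
            if (value != null) {
                total += value.remaining();
            }
        }
        return total;
    }

    // Formats a byte count with binary units (1KB = 1024B).
    static String humanReadable(long bytes) {
        if (bytes < 1024) {
            return bytes + "B";
        }
        if (bytes < 1024 * 1024) {
            return String.format(Locale.ROOT, "%.1fKB", bytes / 1024.0);
        }
        return String.format(Locale.ROOT, "%.1fMB", bytes / (1024.0 * 1024.0));
    }

    // Appends the size to a log message, keeping a trailing ';' at the very end.
    static String appendSize(String logMessage, long bytes) {
        String base = logMessage.endsWith(";")
                ? logMessage.substring(0, logMessage.length() - 1)
                : logMessage;
        return base + " returned " + humanReadable(bytes) + ";";
    }
}
```

    You would call totalBytes for each row as you iterate over the result set, add up the totals, and pass the sum to appendSize along with the message you log. How you hook this into the logged message depends on your setup, since the QueryLogger itself does not report the result size.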