Tags: java, apache-kafka-streams, confluent-cloud

Can I compress intermediate topic (state store) used by KafkaStream in Kafka via producer settings?


I am working with a Kafka cluster hosted in Confluent Cloud.

As per https://docs.confluent.io/cloud/current/client-apps/optimizing/throughput.html#compression, Confluent Cloud enforces compression.type=producer.

I want to compress both the internal *-changelog and output topics.

This question is similar to Can I compress intermediate topic (state store) used by KafkaStream in Kafka, but the accepted solution there does not work for me, since topic-level settings cannot be changed in Confluent Cloud.

I have tried setting producer.compression.type=lz4 in my Kafka Streams app, but I don't see a significant change in the bytes retained in the topic.
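For context, this is roughly how that setting is wired up. In Kafka Streams, any property prefixed with `producer.` is forwarded to the application's internal producers, which also write the `*-changelog` topics. The application id and bootstrap server below are placeholders:

```java
import java.util.Properties;

public class CompressionConfigDemo {

    static Properties streamsProps() {
        Properties props = new Properties();
        props.put("application.id", "compression-demo");   // placeholder app id
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        // The "producer." prefix forwards this setting to every internal
        // producer, so output and *-changelog topics get lz4-compressed batches.
        props.put("producer.compression.type", "lz4");
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsProps();
        System.out.println(props.getProperty("producer.compression.type"));
        // In a real app: new KafkaStreams(topology, props).start();
    }
}
```

If you have the client libraries on the classpath, `StreamsConfig.producerPrefix(ProducerConfig.COMPRESSION_TYPE_CONFIG)` builds the same `producer.compression.type` key without hard-coding the string.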

The setting seemed to work when I tried it with a local Docker broker (the topics on the file system were smaller), but it doesn't appear to make a difference in the retained_bytes metric available from Confluent Cloud.

Is there a good way to validate that this is effectively compressing the messages?


Solution

  • So, it turns out setting compression.type=producer is enough for compressing the *-changelog topics. The amount of data in our testing environment was simply too small for the retained_bytes metric to reflect the compression; it became apparent when we applied the same setting in an environment with more volume.
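Besides watching retained_bytes, one way to verify compression is to inspect a log segment directly with Kafka's kafka-dump-log tool. This only works against a broker whose file system you can reach (e.g. the local Docker broker mentioned above, not Confluent Cloud), and the segment path below is a placeholder:

```shell
# Dump a segment of the changelog topic and inspect the batch metadata.
# Path is hypothetical; adjust to your broker's log.dirs and topic-partition.
kafka-dump-log.sh \
  --files /var/lib/kafka/data/my-app-store-changelog-0/00000000000000000000.log \
  --print-data-log \
  | grep -m1 compresscodec
# A compressed batch reports a codec (e.g. "compresscodec: lz4");
# uncompressed batches report none.
```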