apache-kafka compression gzip snappy lz4

How should I choose the log compression type in Apache Kafka?


I want to compress log data in Apache Kafka. How do I know which compression type to choose? Both performance and disk space are important to me.

In the server.properties file:

compression.type can be set to snappy, gzip, lz4, etc.
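
For reference, a minimal sketch of what the broker-level setting might look like. The value lz4 below is only a placeholder for illustration, not a recommendation; the list of accepted values in the comments is taken from the Kafka broker configuration.

```properties
# server.properties (broker side)
# Accepted values: uncompressed, gzip, snappy, lz4, zstd, producer
# "producer" (the default) keeps whatever codec the producer used.
# lz4 is a placeholder here; pick the codec your own benchmarks favor.
compression.type=lz4
```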


Solution

  • Anecdotally, Uber uses zlib with MsgPack-serialized messages. However, you should run your own benchmarks on your own hardware, network, and storage (for example, Uber's numbers were obtained using Python libraries, so they may not carry over to other clients).

    Regarding the underlying serialization, Avro serialization via a Schema Registry lets you enforce stricter schema definition rules than plaintext or JSON, and Avro generally pairs well with Snappy compression.
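
To make the "run your own benchmarks" advice concrete, here is a rough sketch in Python. It compares compression ratio and time for gzip and zlib from the standard library, and also tries Snappy and LZ4 if the third-party `python-snappy` and `lz4` packages happen to be installed. The payload, field names, and function names (`sample_payload`, `available_codecs`, `benchmark`) are made up for illustration; substitute a sample of your real messages. Numbers from Python libraries will not match what the Java client or the broker achieves, so treat the output only as a relative comparison.

```python
import gzip
import json
import time
import zlib


def sample_payload(n_messages=2000):
    """Build a hypothetical batch of JSON log messages (made-up fields)."""
    rows = [
        {"id": i, "level": "INFO", "service": "checkout", "msg": f"order {i} processed"}
        for i in range(n_messages)
    ]
    return "\n".join(json.dumps(r) for r in rows).encode("utf-8")


def available_codecs():
    """Return {name: compress_fn}; third-party codecs are added only if installed."""
    codecs = {
        "gzip": lambda data: gzip.compress(data, compresslevel=6),
        "zlib": lambda data: zlib.compress(data, 6),
    }
    try:
        import snappy  # optional: python-snappy
        codecs["snappy"] = snappy.compress
    except (ImportError, AttributeError):
        pass
    try:
        import lz4.frame  # optional: lz4
        codecs["lz4"] = lz4.frame.compress
    except (ImportError, AttributeError):
        pass
    return codecs


def benchmark(data, repeats=5):
    """Measure compression ratio and best-of-N wall time for each codec."""
    results = {}
    for name, compress in available_codecs().items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            compressed = compress(data)
            best = min(best, time.perf_counter() - start)
        results[name] = {"ratio": len(data) / len(compressed), "seconds": best}
    return results


if __name__ == "__main__":
    payload = sample_payload()
    for name, stats in sorted(benchmark(payload).items()):
        print(f"{name:>6}: ratio {stats['ratio']:.2f}x, "
              f"best time {stats['seconds'] * 1000:.2f} ms")
```

If space matters most, favor the codec with the highest ratio; if throughput matters most, favor the one with the lowest time. Rerun the comparison with batch sizes close to what your producers actually send, since compression ratio depends heavily on batch size.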