apache-kafka rhel

Kafka: what are the advantages of having more Kafka machines in the cluster?


We have the following options for designing a new Kafka production cluster (OS: RHEL 7.9):

  1. 5 Kafka brokers, usable storage per broker: 96,000 GB (96 TB)

  2. 18 Kafka brokers, usable storage per broker: 20,000 GB (20 TB)

The second option is more expensive, since it needs 18 machines (each with 20 TB of Kafka disk storage), whereas the first option needs only 5 machines, each with 96 TB of Kafka disk storage.

But we are wondering what the best practice is from the Kafka performance side.

Is it better to have 18 Kafka machines with less storage each,

or 5 Kafka machines with 96 TB of storage per machine?

What are the advantages and disadvantages of options 1 and 2?
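For reference, the raw capacity of the two options can be compared with a quick calculation. This is only a sketch: the replication factor of 3 is an assumption (it is not stated above), and the function name is purely illustrative.

```python
def cluster_capacity_tb(brokers, tb_per_broker, replication_factor=3):
    """Return (raw capacity, capacity after replication) in TB.

    replication_factor=3 is an assumed value, not taken from the question.
    """
    raw = brokers * tb_per_broker
    # Every partition is stored replication_factor times across the cluster,
    # so the space left for unique data shrinks by that factor.
    effective = raw / replication_factor
    return raw, effective


print(cluster_capacity_tb(5, 96))   # (480, 160.0)
print(cluster_capacity_tb(18, 20))  # (360, 120.0)
```

Note that the two options do not offer the same total storage: option 1 gives 480 TB raw, option 2 gives 360 TB raw.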


Solution

  • I don't understand why those are the only two options. Couldn't you have, say, 7 or 9 brokers? At that scale, the amount of storage per broker shouldn't matter.

    More brokers are "better" because they give you higher aggregate network throughput and more capacity for replication, so storage is not the primary factor.

    Three brokers is a good minimum for high availability, but then you cannot afford to lose any of them, even for planned downtime during upgrades. Therefore 4 or 5 brokers would be best, at a minimum.

    On the other hand, 18 is rather high for one cluster without knowing your use case (you can always expand a cluster later). Instead, you could take those machines and divide them into two or more "failover"/active-active clusters, or dedicate some of them to other use cases such as metrics and logs rather than strictly application events, or to stricter security policies, or to separate dev/prod clusters, and so on.
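To make the throughput and availability argument above more concrete, here is a rough back-of-the-envelope sketch in Python. Everything beyond the broker counts and disk sizes is an assumption for illustration: disks are taken to be 50% full, each broker is taken to have a single 10 Gbit/s NIC, and a failed broker is taken to be replaced by an empty machine with the same broker id, which then has to re-fetch all of its replicas over that NIC. The function name is hypothetical, and the result is an optimistic lower bound, since it assumes the whole NIC is available for recovery.

```python
def broker_loss_estimate(brokers, tb_per_broker, fill_ratio=0.5, nic_gbit_per_s=10.0):
    """Rough impact of losing one broker (all defaults are assumed values)."""
    # Data the replacement broker must fetch again from the other brokers.
    data_tb = tb_per_broker * fill_ratio
    # 1 TB = 8000 Gbit (decimal units); assume the NIC is the only bottleneck.
    resync_hours = data_tb * 8000 / nic_gbit_per_s / 3600
    return {
        "share_of_cluster_lost": round(1 / brokers, 3),
        "data_to_resync_tb": data_tb,
        "min_resync_hours": round(resync_hours, 2),
        "aggregate_nic_gbit_per_s": brokers * nic_gbit_per_s,
    }


for name, (brokers, tb) in {"option 1": (5, 96), "option 2": (18, 20)}.items():
    print(name, broker_loss_estimate(brokers, tb))
```

Under these assumptions, a failed broker in option 1 takes away 20% of the cluster and needs at least about 10.67 hours to re-sync 48 TB, while in option 2 it takes away roughly 5.6% and needs at least about 2.22 hours for 10 TB. The aggregate NIC bandwidth is 50 Gbit/s versus 180 Gbit/s, which is the sense in which more brokers give more network throughput.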