To keep the scenario simple, assume:
number of consumers == number of partitions == number of Kafka brokers
If I deploy the consumers on the same machines as the brokers, how can I make each consumer read only the messages stored locally? The goal is to eliminate the network overhead.
I think this would work if each consumer knew the partition ID hosted on its own machine, but I don't know how to find that out. Or is there another way to solve this problem?
Thanks.
bin/kafka-topics.sh --zookeeper [zk address] --describe --topic [topic_name]
This command tells you which broker hosts the leader for each partition. You can then use manual partition assignment in each consumer so that it consumes only from a partition whose leader is on the same machine.
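
As a rough sketch of that first step, the snippet below parses the text printed by the describe command and returns the partitions whose leader is a given broker. The function name `local_partitions`, the topic name `events`, and the sample output are made up for illustration, and the exact line format can differ between Kafka versions, so treat the regular expression as an assumption to check against your own output. The local broker ID is assumed to be the `broker.id` configured on that machine.

```python
import re

# Matches per-partition lines of the describe output, for example:
#   "Topic: events  Partition: 0  Leader: 1  Replicas: 1  Isr: 1"
# The summary line ("PartitionCount:3") does not match this pattern.
_PARTITION_LINE = re.compile(r"\bPartition:\s*(\d+)\s+Leader:\s*(-?\d+)")


def local_partitions(describe_output, local_broker_id):
    """Return the partition ids whose leader is local_broker_id."""
    partitions = []
    for line in describe_output.splitlines():
        match = _PARTITION_LINE.search(line)
        if match and int(match.group(2)) == local_broker_id:
            partitions.append(int(match.group(1)))
    return partitions


# Hypothetical describe output for a 3-partition topic on 3 brokers.
SAMPLE = """\
Topic:events\tPartitionCount:3\tReplicationFactor:1\tConfigs:
\tTopic: events\tPartition: 0\tLeader: 1\tReplicas: 1\tIsr: 1
\tTopic: events\tPartition: 1\tLeader: 2\tReplicas: 2\tIsr: 2
\tTopic: events\tPartition: 2\tLeader: 3\tReplicas: 3\tIsr: 3
"""

if __name__ == "__main__":
    # Partitions led by the broker with id 2.
    print(local_partitions(SAMPLE, 2))
```

The resulting partition IDs are what you would pass to the consumer's manual assignment call (`assign` in the Java consumer) instead of subscribing to the topic. Keep in mind that leadership can move to another broker after a failure or a reassignment, so the mapping should be refreshed rather than computed once.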