java message-queue apache-kafka producer-consumer kafka-producer-api

Kafka cluster zookeeper failure handling


I am going to set up a Kafka cluster consisting of 3 machines: one for ZooKeeper and the other 2 as brokers. I have about 6 consumer machines and about a hundred producers.

If one of the brokers fails, data loss is avoided thanks to replication. But what if ZooKeeper fails and the same machine cannot be restarted? I have several questions:

  1. I noticed that even after the ZooKeeper failure, producers continued to push messages to their designated broker, but consumers could no longer retrieve them because the consumers had been unregistered. Is the data permanently lost in this case?
  2. How can the ZooKeeper IP in the broker config be changed at run time? Do the brokers have to be shut down to change it?
  3. Even if a new ZooKeeper machine is somehow brought into the cluster, would the previous data be lost?

Solution

  • Running only one instance of ZooKeeper is not fault-tolerant, and the cluster's behavior after it fails cannot be predicted. According to the HBase reference guide, you should set up an ensemble of at least 3 servers.

    Have a look at the official documentation page: ZooKeeper clustered setup.
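
To see why 3 servers is the usual minimum, here is a rough sketch of the arithmetic. It assumes ZooKeeper's majority-quorum rule: the ensemble stays available only while more than half of its servers are up. The class name ZkEnsembleSketch, the host names zk1, zk2, zk3 and port 2181 are placeholders for illustration, not values from the question. The helper that joins hosts only shows the comma-separated host:port format the brokers' zookeeper.connect property expects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; class name, host names, and port are assumptions.
public class ZkEnsembleSketch {

    // Smallest number of servers that forms a strict majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble needs at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority is still running.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    // Joins hosts into the comma-separated host:port list used by zookeeper.connect.
    static String connectString(List<String> hosts, int port) {
        return hosts.stream()
                .map(host -> host + ":" + port)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("ensemble=%d quorum=%d toleratedFailures=%d%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
        // Hypothetical host names; replace with the real ensemble members.
        List<String> hosts = Arrays.asList("zk1", "zk2", "zk3");
        System.out.println("zookeeper.connect=" + connectString(hosts, 2181));
    }
}
```

With a single server, the tolerated failure count is zero, which matches the situation described in the question. Two servers tolerate no more failures than one, so three is the smallest ensemble that survives the loss of one machine.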