apache-kafka · kafka-consumer-api · kafka-producer-api

Kafka topic creation failing when 3 of 4 brokers in a cluster are up


Kafka topic creation is failing in the scenario below:

Nodes in the Kafka cluster: 4

Replication factor: 4

Number of nodes up and running in the cluster: 3

Below is the error:

./kafka-topics.sh --zookeeper :2181 --create --topic test_1 --partitions 1 --replication-factor 4
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : Replication factor: 4 larger than available brokers: 3.
[2018-10-31 11:58:13,084] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 4 larger than available brokers: 3.

Is this valid behavior or a known issue in Kafka?

If all the nodes in a cluster must always be up and running, then what about failure tolerance?

Updating the JSON file to increase the replication factor for an already created topic:

$cat /tmp/increase-replication-factor.json
{"version":1,
  "partitions":[
     {"topic":"vHost_v81drv4","partition":0,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":1,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":2,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":3,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":4,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":5,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":6,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":7,"replicas":[4,1,2,3]}
]}

Solution

  • When a new topic is created in Kafka, it is replicated N = replication-factor times across your brokers. Since you have 3 brokers up and running and the replication factor set to 4, the topic cannot be replicated 4 times, and thus you get an error.

    When creating a new topic you either need to ensure that all 4 of your brokers are up and running, or set the replication factor to a smaller value, in order to avoid a failure on topic creation when one of your brokers is down.
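    The broker-side check behind the error above can be sketched in a few lines of Python. This is a simplified model of the rule, not Kafka's actual implementation:

```python
def validate_replication_factor(replication_factor, live_broker_ids):
    """Mimic the check behind InvalidReplicationFactorException:
    a topic cannot be replicated more times than there are live brokers."""
    if replication_factor > len(live_broker_ids):
        raise ValueError(
            f"Replication factor: {replication_factor} larger than "
            f"available brokers: {len(live_broker_ids)}."
        )
    return True

# 3 of 4 brokers alive: replication-factor=3 is accepted...
validate_replication_factor(3, [1, 2, 3])
# ...but replication-factor=4 raises, matching the error in the question.
```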

    In case you want to create a topic with the replication factor set to 4 while one broker is down, you can initially create the topic with replication-factor=3, and once your 4th broker is up and running you can modify that topic's configuration and increase its replication factor by following the steps below (assuming you have a topic example with 4 partitions):

    Create an increase-replication-factor.json file with this content:

    {"version":1,
      "partitions":[
         {"topic":"example","partition":0,"replicas":[0,1,2,3]},
         {"topic":"example","partition":1,"replicas":[0,1,2,3]},
         {"topic":"example","partition":2,"replicas":[0,1,2,3]},
         {"topic":"example","partition":3,"replicas":[0,1,2,3]}
    ]}
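    For a topic with many partitions, the reassignment file can be generated rather than written by hand. A minimal Python sketch (the topic name, partition count, and replica list below are just placeholders for your own values):

```python
import json

def reassignment_json(topic, num_partitions, replicas):
    """Build the structure accepted by kafka-reassign-partitions:
    every partition gets the same target replica list."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": replicas}
            for p in range(num_partitions)
        ],
    }

plan = reassignment_json("example", 4, [0, 1, 2, 3])
with open("increase-replication-factor.json", "w") as f:
    json.dump(plan, f, indent=2)
```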
    

    Then execute the following command:

    kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
    

    And finally you'd be able to confirm that your topic is replicated across the 4 brokers:

    kafka-topics --zookeeper localhost:2181 --topic example --describe
    Topic:example   PartitionCount:4    ReplicationFactor:4 Configs:retention.ms=1000000000
    Topic: example  Partition: 0    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
    Topic: example  Partition: 1    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
    Topic: example  Partition: 2    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
    Topic: example  Partition: 3    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
    

    Regarding high availability, let me explain how Kafka works:

    Every topic is a particular stream of data (similar to a table in a database). Topics are split into partitions (as many as you like), where each message within a partition gets an incremental id, known as its offset, as shown below.

    Partition 0:

    +---+---+---+-----+
    | 0 | 1 | 2 | ... |
    +---+---+---+-----+
    

    Partition 1:

    +---+---+---+---+----+
    | 0 | 1 | 2 | 3 | .. |
    +---+---+---+---+----+
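    The offset scheme pictured above can be modeled in a few lines of Python (a conceptual sketch, not the actual Kafka log format):

```python
class Partition:
    """A partition is an append-only log; each message gets the next
    incremental id (its offset) within that partition."""
    def __init__(self):
        self.log = []

    def append(self, message):
        offset = len(self.log)  # offsets start at 0 and only ever grow
        self.log.append(message)
        return offset

p0 = Partition()
assert p0.append("a") == 0
assert p0.append("b") == 1
# Offsets are only meaningful within a single partition:
p1 = Partition()
assert p1.append("x") == 0
```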
    

    Now a Kafka cluster is composed of multiple brokers. Each broker is identified with an ID and can contain certain topic partitions.

    Example of 2 topics (having 3 and 2 partitions respectively):

    Broker 1:

    +-------------------+
    |      Topic 1      |
    |    Partition 0    |
    |                   |
    |                   |
    |     Topic 2       |
    |   Partition 1     |
    +-------------------+
    

    Broker 2:

    +-------------------+
    |      Topic 1      |
    |    Partition 2    |
    |                   |
    |                   |
    |     Topic 2       |
    |   Partition 0     |
    +-------------------+
    

    Broker 3:

    +-------------------+
    |      Topic 1      |
    |    Partition 1    |
    |                   |
    |                   |
    |                   |
    |                   |
    +-------------------+
    

    Note that data is distributed (and Broker 3 doesn't hold any data of topic 2).
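    The way partitions get spread over brokers can be sketched as a simple round-robin assignment. This is only an illustration of the distribution property (the exact layout will differ from the diagram above, and Kafka's real assignment also balances replicas and leaders):

```python
def assign_partitions(topics, broker_ids):
    """topics: {topic_name: partition_count}. Assign each partition to a
    broker round-robin, so data is spread across the cluster."""
    assignment = {b: [] for b in broker_ids}
    i = 0
    for topic, partitions in topics.items():
        for p in range(partitions):
            broker = broker_ids[i % len(broker_ids)]
            assignment[broker].append((topic, p))
            i += 1
    return assignment

layout = assign_partitions({"topic1": 3, "topic2": 2}, [1, 2, 3])
# With 5 partitions over 3 brokers, at least one broker ends up
# holding no data for topic2, just like Broker 3 in the diagrams.
```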

    Topics should have a replication-factor greater than 1 (usually 2 or 3), so that when a broker is down another one can still serve the data of that topic. For instance, assume we have a topic with 2 partitions and a replication-factor set to 2, as shown below:

    Broker 1:

    +-------------------+
    |      Topic 1      |
    |    Partition 0    |
    |                   |
    |                   |
    |                   |
    |                   |
    +-------------------+
    

    Broker 2:

    +-------------------+
    |      Topic 1      |
    |    Partition 0    |
    |                   |
    |                   |
    |     Topic 1       |
    |   Partition 1     |
    +-------------------+
    

    Broker 3:

    +-------------------+
    |      Topic 1      |
    |    Partition 1    |
    |                   |
    |                   |
    |                   |
    |                   |
    +-------------------+
    

    Now assume that Broker 2 has failed. Brokers 1 and 3 can still serve the data for topic 1. So a replication-factor of 3 is always a good idea, since it allows one broker to be taken down for maintenance purposes and another one to fail unexpectedly. Therefore, Apache Kafka offers strong durability and fault-tolerance guarantees.

    Note about leaders: at any time, only one broker can be the leader of a given partition, and only that leader can receive and serve data for that partition. The remaining brokers just synchronize the data (in-sync replicas). Also note that when the replication-factor is set to 1, the leader cannot be moved elsewhere when its broker fails. In general, when all replicas of a partition fail or go offline, the leader is automatically set to -1.
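    The leader behavior described above can be illustrated with a small simulation. This is an assumption-laden sketch of the rule, not Kafka's actual controller logic:

```python
def elect_leader(replicas, live_brokers):
    """Pick the first replica that is still alive as the leader;
    if every replica is offline, the leader is set to -1."""
    for broker in replicas:
        if broker in live_brokers:
            return broker
    return -1

replicas = [2, 0, 1]                             # replica list for one partition
assert elect_leader(replicas, {0, 1, 2}) == 2    # normal case: preferred leader
assert elect_leader(replicas, {0, 1}) == 0       # broker 2 failed, leader moves
assert elect_leader(replicas, set()) == -1       # all replicas offline
```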