apache-kafka, apache-kafka-streams

Idle replicas of a Kafka Streams application


My stateless Kafka Streams application reads messages from 12 topics and, after a simple transformation, writes them to a single output topic. Every input topic and the output topic has 10 partitions, so there are 120 input partitions and 10 output partitions. Because of high lag, the application was autoscaled to 14 replicas, but input partitions were assigned to only 10 of them (12 partitions per instance); the other 4 stayed idle.

Despite my application being stateless, I tried to play with acceptable.recovery.lag and set it to 9223372036854775807, but that did not help.

I also tried to remove replicas to initiate rebalancing manually and tried to restart the application, but every time the result was the same.

Can someone advise me on what's going on? Why are 4 replicas idle? Could it be related to the number of output partitions somehow?

UPD: I have Kafka version: 3.3.2


Solution

  • What you're seeing is expected behavior. When you create a single KStream from multiple topics, Kafka Streams creates a number of tasks equal to the maximum partition count across those input topics. So in your case, I'm assuming you have something like this:

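    // stream() takes a Collection of topic names (no varargs); String key/value types are assumed here
    KStream<String, String> myStream = builder.stream(List.of(topic1, topic2, ..., topic12));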
    

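    Since the max number of partitions across all of them is 10, Kafka Streams creates 10 tasks, and each task reads the same-numbered partition from every topic (hence 12 partitions per instance). With one stream thread per instance, only 10 instances can receive a task assignment; any beyond that will be idle, which is exactly what you're seeing here. The number of output partitions plays no part in this.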
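
    The arithmetic above can be sketched in plain Java. This is only a back-of-the-envelope model, not a call into Kafka: the class and method names are made up for illustration, and it assumes one stream thread per replica and a single sub-topology.

```java
import java.util.Collections;
import java.util.List;

// Illustrative model only; TaskCountSketch and its methods are hypothetical names.
class TaskCountSketch {

    // Tasks in one sub-topology = largest partition count among its source topics.
    static int taskCount(List<Integer> partitionsPerTopic) {
        return Collections.max(partitionsPerTopic);
    }

    // Stream threads that cannot receive a task (assumes one thread per replica).
    static int idleSlots(int slots, int tasks) {
        return Math.max(0, slots - tasks);
    }

    public static void main(String[] args) {
        // 12 input topics with 10 partitions each, as described in the question.
        List<Integer> partitions = Collections.nCopies(12, 10);
        int tasks = taskCount(partitions);
        System.out.println("tasks = " + tasks);
        System.out.println("idle replicas = " + idleSlots(14, tasks));
    }
}
```

    With the numbers from the question this gives 10 tasks and 4 idle replicas out of 14.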

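    I can think of 2 possible workarounds, but fair warning, I haven't tried them in a realistic setting. First, you could try to create a separate KStream object for each input topic (12 in your case), all with the same processing logic. As long as the streams are not merged back together, each topic should become its own sub-topology with its own 10 tasks, for example: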

    KStream<String, String> stream1 = builder.stream(topic1);
    ....
    KStream<String, String> stream12 = builder.stream(topic12);
    

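    The other option would be to add a repartition step. Via the Repartitioned object, you can specify a higher number of partitions for the repartition topic. The processing downstream of it then runs in as many tasks as that topic has partitions, which should make it possible to keep more than 10 instances busy. The part that reads the input topics still has only 10 tasks, and the extra topic costs an additional write and read per record.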
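
    A minimal sketch of the repartition idea follows, assuming String keys and values and the kafka-streams library on the classpath. The class name, the repartition name "fan-out", the topic names, and the upper-casing step are placeholders, not taken from the question.

```java
import java.util.List;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Repartitioned;

// Hypothetical example class; names and serdes are assumptions.
class RepartitionSketch {

    static Topology build(List<String> inputTopics, String outputTopic, int partitions) {
        StreamsBuilder builder = new StreamsBuilder();
        builder
            // First sub-topology: reads all input topics (task count = max partition count).
            .stream(inputTopics, Consumed.with(Serdes.String(), Serdes.String()))
            // Internal repartition topic with an explicitly chosen partition count.
            .repartition(Repartitioned.<String, String>as("fan-out")
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String())
                .withNumberOfPartitions(partitions))
            // Second sub-topology: placeholder for the real transformation.
            .mapValues(value -> value.toUpperCase())
            .to(outputTopic, Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    public static void main(String[] args) {
        // describe() prints the sub-topologies without needing a running broker.
        System.out.println(build(List.of("topic1", "topic2"), "output", 20).describe());
    }
}
```

    Printing the topology description is a cheap way to check that the transformation ended up in a second sub-topology behind the repartition topic before deploying anything.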

    HTH