apache-kafka, apache-kafka-streams

Difference between num.standby.replicas vs max.warmup.replicas


I am a bit confused about the difference between num.standby.replicas and max.warmup.replicas. They sound the same to me, since both help reduce the time it takes to get a standby task and its state store ready to be promoted to active during a consumer group rebalance. Thanks in advance.


Solution

  • num.standby.replicas is a per-task setting that applies to high availability (HA): it is the number of standby copies kept for each task. If you lose an active task, Kafka Streams promotes its standby task to active immediately.

    max.warmup.replicas, by contrast, is a "global" setting, and it comes into play when you are scaling out, that is, adding a Kafka Streams instance with the same application.id.

    In the scale-out scenario with max.warmup.replicas=1, Kafka Streams "warms up" a single task A by starting a copy A' on the new node. Once A' has caught up to within the acceptable lag setting (acceptable.recovery.lag), task A migrates to the new node (A' -> A), and the process repeats for another task. If you set max.warmup.replicas=2, Kafka Streams warms up two tasks, A and B, at a time, and so on.
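As a rough illustration of how the settings above sit side by side, here is a minimal sketch that builds the configuration with plain java.util.Properties, so it runs without Kafka on the classpath. The three replica-related keys are the documented Kafka Streams config names; the class name, the application id, the broker address, and the chosen values are assumptions made up for this example, not recommendations.

```java
import java.util.Properties;

public class StreamsReplicaConfig {

    // Builds the properties one might pass to a Kafka Streams application.
    static Properties build() {
        Properties props = new Properties();

        // Hypothetical values, for illustration only.
        props.put("application.id", "my-streams-app");
        props.put("bootstrap.servers", "localhost:9092");

        // Per task: keep one standby copy of each task's state for failover (HA).
        props.put("num.standby.replicas", "1");

        // Global: at most two extra warm-up copies at a time while tasks
        // are being moved to an instance that has not caught up yet.
        props.put("max.warmup.replicas", "2");

        // How far behind (in offsets) a copy may be and still count as caught up.
        props.put("acceptable.recovery.lag", "10000");

        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration; iteration order is not guaranteed.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties would be handed to the KafkaStreams constructor together with the topology, and the same keys are also available as constants on StreamsConfig (for example StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG and StreamsConfig.MAX_WARMUP_REPLICAS_CONFIG).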