java, apache-kafka, spring-cloud, partitioning, spring-cloud-stream

Spring Cloud Stream + Kafka Binder: What is the default partition key extractor strategy or partition key?


I'm looking to implement a scenario where the order in which messages are consumed does not matter, and I want to publish to multiple partitions.

In this scenario, what strategy is used to select a partition if partition-key-expression is not specified in the producer definition? For example, with this config I see messages being published to all 4 partitions, but it is not clear what mechanism is being used:

spring:
  cloud:
    stream:
      bindings:
        outbound:
          destination: mytopic
          group: app
          producer:
            partitioned: true
            partition-count: 4

Solution

  • Binder-level partitioning is not needed with the Kafka binder; it is intended for binders whose underlying technology does not have native partitioning.

    There is no need to set the partitioned=true property.

    Kafka's default partitioning strategy is used instead; see the partitioner.class producer config:

    https://kafka.apache.org/documentation/#producerconfigs_partitioner.class

    If no partitioner.class is specified, the documentation states:

    If not set, the default partitioning logic is used. This strategy will try sticking to a partition until batch.size bytes is produced to the partition.

    The default logic works as follows:

    If no partition is specified but a key is present, choose a partition based on a hash of the key

    If no partition or key is present, choose the sticky partition that changes when batch.size bytes are produced to the partition.
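
The decision order described above can be sketched in plain Java. This is only an illustration, not Kafka's actual code: the class name PartitionSketch, the method choosePartition, and the stickyPartition parameter are hypothetical, and the hash is a simple stand-in (Kafka itself uses a murmur2 hash of the serialized key), so the partition numbers it produces will not match a real producer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the default partition selection order; not Kafka's implementation.
class PartitionSketch {

    // Stand-in hash. Kafka uses murmur2 on the serialized key bytes; this is NOT murmur2.
    static int hash(byte[] keyBytes) {
        return Arrays.hashCode(keyBytes) & 0x7fffffff; // mask the sign bit so the result is non-negative
    }

    // explicitPartition: partition set on the record, or null if none was set.
    // keyBytes: serialized record key, or null if the record has no key.
    // stickyPartition: the partition the producer is currently "stuck" to (assumed to be tracked elsewhere).
    static int choosePartition(Integer explicitPartition, byte[] keyBytes,
                               int numPartitions, int stickyPartition) {
        if (explicitPartition != null) {
            return explicitPartition;               // 1. an explicit partition always wins
        }
        if (keyBytes != null) {
            return hash(keyBytes) % numPartitions;  // 2. same key -> same partition
        }
        return stickyPartition;                     // 3. no key -> current sticky partition
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        int sticky = 2;
        byte[] key = "order-42".getBytes(StandardCharsets.UTF_8);

        System.out.println("explicit: " + choosePartition(3, key, numPartitions, sticky));
        System.out.println("keyed:    " + choosePartition(null, key, numPartitions, sticky));
        System.out.println("no key:   " + choosePartition(null, null, numPartitions, sticky));
    }
}
```

With the configuration in the question, no key is set on the outbound records, so the third branch applies: the producer fills one partition's batch and then moves on, which over time spreads messages across all 4 partitions.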