java, apache-kafka, kafka-producer-api

Callback method vs get method in Kafka producer


To get the details of a produced record, we have two options to choose from:

  1. onCompletion() - callback function
  2. get() method

Could someone please explain the difference between them, and how to use each of them in detail? (Java)

NOTE: The producer properties I'm using are mostly the defaults (e.g. batch.size, acks, max.block.ms...)


Solution

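  • onCompletion is the asynchronous way of producing data to Kafka, and calling get in a loop is the synchronous way. Strictly speaking, send() itself is always asynchronous and returns a Future; calling get() on that Future after every send is what makes the write blocking.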

    A Kafka producer can write to a topic at very high throughput. If you call get() after every send, the producer has to wait for the ack from Kafka before it can issue the next write, and this throttles its throughput. It has to wait for Kafka to store the record, replicate it (depending on how acks and replication are configured), and then return the ack for the successful write.
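The blocking variant can be sketched as follows. This is a minimal illustration rather than a reference implementation: the broker address `localhost:9092`, the topic `demo-topic`, and the class and method names are all placeholders chosen for this example.

```java
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

class SyncSendExample {

    // Sends one record and blocks until the send is acknowledged or fails.
    // A failed send surfaces as an ExecutionException thrown by get().
    static RecordMetadata sendAndWait(Producer<String, String> producer,
                                      String topic, String key, String value)
            throws InterruptedException, ExecutionException {
        ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, value);
        return producer.send(record).get();
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address; replace with your own.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 5; i++) {
                // Each iteration waits for the previous record's ack before continuing.
                RecordMetadata md = sendAndWait(producer, "demo-topic", "key-" + i, "value-" + i);
                System.out.printf("topic=%s partition=%d offset=%d%n",
                        md.topic(), md.partition(), md.offset());
            }
        }
    }
}
```

Because every iteration waits for one round trip, the loop sends at most one record per acknowledgement, which is the throughput cost described above.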

    The alternative is onCompletion: here the producer keeps on producing data without waiting for the ack from Kafka. The producer client invokes onCompletion once the send has completed, passing the record metadata if the write succeeded or a non-null exception if it failed. The application needs to keep track of these onCompletion calls and, if a write fails, decide whether to retry it.
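A callback-based send might look like the sketch below. The broker address, topic name, class name, and the two counters are assumptions made for illustration; only `send(record, callback)` and `Callback.onCompletion(RecordMetadata, Exception)` come from the producer API.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

class AsyncSendExample {

    // Hypothetical counters; the callback may run on the producer's I/O thread,
    // so they are atomic.
    static final AtomicInteger acked = new AtomicInteger();
    static final AtomicInteger failed = new AtomicInteger();

    // Hands the record to the producer and returns without waiting for the ack.
    static void sendAsync(Producer<String, String> producer,
                          String topic, String key, String value) {
        ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, value);
        producer.send(record, new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) {
                    // The send failed; the application decides what to do next.
                    failed.incrementAndGet();
                    System.err.println("send failed for key " + key + ": " + exception);
                } else {
                    acked.incrementAndGet();
                    System.out.printf("acked key=%s partition=%d offset=%d%n",
                            key, metadata.partition(), metadata.offset());
                }
            }
        });
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed broker address; replace with your own.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 5; i++) {
                sendAsync(producer, "demo-topic", "key-" + i, "value-" + i);
            }
            // Wait for the buffered records to complete before closing.
            producer.flush();
        }
        System.out.println("acked=" + acked.get() + " failed=" + failed.get());
    }
}
```

Keep the work inside onCompletion short, since a slow callback can delay the completion of other sends.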

    What producers generally do is send a batch of N records to Kafka, wait for all of their completion events, and then send the next N records. This loosely resembles the TCP sliding-window flow control paradigm.
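One way to sketch the "send N, then wait for all N" pattern is to keep the Future returned by each send() and wait on them only after the whole batch has been handed to the producer. The class and method names and the topic are hypothetical. The sketch waits on futures rather than callbacks; a callback combined with a CountDownLatch would be an alternative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

class BatchedSendExample {

    // Sends every value without blocking, then waits for all acknowledgements.
    // Throws ExecutionException if any record in the batch failed.
    static List<RecordMetadata> sendBatch(Producer<String, String> producer,
                                          String topic, List<String> values)
            throws InterruptedException, ExecutionException {
        List<Future<RecordMetadata>> pending = new ArrayList<>();
        for (String value : values) {
            ProducerRecord<String, String> record = new ProducerRecord<>(topic, value);
            pending.add(producer.send(record)); // returns immediately
        }

        // Ask the producer to send buffered records now and wait for them to complete.
        producer.flush();

        List<RecordMetadata> results = new ArrayList<>();
        for (Future<RecordMetadata> future : pending) {
            results.add(future.get());
        }
        return results;
    }
}
```

A caller would invoke `sendBatch` once per group of N records and start the next group only after it returns, so at most N records are unacknowledged at any time.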

    It is difficult to say what you should do. The downside of working with onCompletion and retrying from there is that it jeopardizes the ordering of the records in Kafka.

    For example, the producer may have sent records 1..65 successfully, then Kafka missed 66..72, and then Kafka wrote 73..99. By the time Kafka has finished writing 99, the producer may only just be getting the onCompletion callbacks for 66, 67 and so on (since the callback is asynchronous, it can arrive at any time) and retrying them. The retried records then land after 99, which jumbles up the record ordering.
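The effect can be illustrated with a small simulation in plain Java. It does not use the Kafka API: the partition log is modelled as a list, and it assumes that records 66..72 are the ones that fail and that the application retries them only after record 99 has been written.

```java
import java.util.ArrayList;
import java.util.List;

class RetryReorderingSimulation {

    // Returns the simulated log order after late, application-level retries.
    static List<Integer> simulate() {
        List<Integer> log = new ArrayList<>();    // stands in for one partition
        List<Integer> failed = new ArrayList<>(); // records whose first write failed

        for (int id = 1; id <= 99; id++) {
            boolean writeFails = id >= 66 && id <= 72; // assumed failure window
            if (writeFails) {
                failed.add(id);
            } else {
                log.add(id);
            }
        }

        // The failure callbacks are handled only now, so the retries are appended last.
        log.addAll(failed);
        return log;
    }

    public static void main(String[] args) {
        List<Integer> log = simulate();
        System.out.println(log.subList(63, 68));                     // [64, 65, 73, 74, 75]
        System.out.println(log.subList(log.size() - 8, log.size())); // [99, 66, 67, 68, 69, 70, 71, 72]
    }
}
```

All 99 records end up in the log, but 66..72 sit after 99 instead of between 65 and 73.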

    In those cases, the consumer needs to understand that the writes may not all be in order.

    My suggestion would be to use onCompletion for a batch of records. Generally, applications don't have very strict ordering requirements. So you could leverage the async nature of the call and improve throughput.