I want to process messages from a Kafka topic using Kafka Streams.
The last step of the processing is to write the result to a database table. To avoid database contention issues (the program will run 24/7 and process millions of messages), I plan to batch the JDBC calls.
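For context, this is roughly what I mean by batching the JDBC calls; the table and column names below are just placeholders:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class ResultBatchWriter {

    // Flush a list of already-processed results to the database in one batch.
    public void writeBatch(Connection conn, List<String[]> rows) throws Exception {
        // "results", "msg_key" and "msg_value" are placeholder names.
        String sql = "INSERT INTO results (msg_key, msg_value) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch(); // one round trip instead of one call per message
        }
    }
}
```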
But with this approach there is a possibility of messages being lost. For example, I read 500 messages from the topic, Streams marks the offsets, and then the program fails: the messages still sitting in the JDBC batch are lost, yet their offsets are already marked as committed.
I want to mark the offset of the last message manually, only after the database insert/update completes, but that is not possible according to this question: How to commit manually with Kafka Stream?
Can someone please suggest a possible solution?
Kafka Streams doesn't support manual offset commits, and it doesn't support batch processing either. For your use case there are a few possibilities:
Use a plain Kafka consumer, implement the batching yourself, and commit offsets manually (see the sketch after this list).
Use Spark Structured Streaming with Kafka, as described in Kafka Spark Structured Stream.
Try Spring Kafka (Spring Kafka).
In this kind of scenario you could also consider the Kafka JDBC sink connector: Kafka JDBC Connector.
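A minimal sketch of the first option, assuming a plain KafkaConsumer and a hypothetical results table; the broker address, topic name, group id, and JDBC URL are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BatchingConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "db-writer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit only after the DB write succeeds
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");     // cap the batch size per poll

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/app", "user", "pass")) { // placeholder JDBC URL
            consumer.subscribe(List.of("input-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO results (msg_key, msg_value) VALUES (?, ?)")) {
                    for (ConsumerRecord<String, String> r : records) {
                        ps.setString(1, r.key());
                        ps.setString(2, r.value());
                        ps.addBatch();
                    }
                    ps.executeBatch();  // one JDBC round trip for the whole poll
                }
                consumer.commitSync();  // offsets advance only after the batch is persisted
            }
        }
    }
}
```

Because offsets are committed only after executeBatch() succeeds, a crash in between causes re-delivery rather than loss, i.e. at-least-once semantics; an upsert or a unique key on the target table keeps redelivered rows from turning into duplicates.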