I need a grace period before consuming Kafka messages.
My approach is to use a hopping window.
For example, if I want to consume each message only after 5 minutes, the hopping window would be 6 minutes long and advance by 1 minute.
Then I'll use a filter to keep only data older than 5 minutes (each message also carries its own timestamp). That way I first process the data from minute 0 to minute 1. Then the hopping window advances by 1 minute and I process the data from minute 1 to minute 2, and so on.
However, I need to consume all messages when the application starts, not just those from the last 6 minutes. I'm also open to other suggestions regarding the 5-minute grace period.
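To make the "older than 5 minutes" check concrete, here is a minimal sketch in plain Java. The class `GraceFilter` and the method `isOldEnough` are hypothetical names made up for illustration, not part of Kafka Streams, and the sketch assumes the timestamp inside the message has already been parsed into an `Instant`.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration; not part of Kafka Streams.
final class GraceFilter {
    // The grace period from the question.
    static final Duration GRACE = Duration.ofMinutes(5);

    // True when the message's own timestamp lies at least GRACE before `now`.
    static boolean isOldEnough(Instant messageTime, Instant now) {
        return !messageTime.plus(GRACE).isAfter(now);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-01T12:10:00Z");
        // 6 minutes old: passes the filter.
        System.out.println(isOldEnough(Instant.parse("2024-01-01T12:04:00Z"), now)); // true
        // 4 minutes old: still inside the grace period.
        System.out.println(isOldEnough(Instant.parse("2024-01-01T12:06:00Z"), now)); // false
    }
}
```

In a Kafka Streams topology, a predicate like this could be used in a `filter` step, with `now` taken from whatever clock the application treats as authoritative.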
I made a wrong assumption here. All the data in the topic is consumed, no matter how old it is.
For example, it's 12:10 now and we start the Kafka Streams application.
The data in the topic that we want to consume was pushed at 12:00, and we have a window of 6 minutes.
I was expecting everything from 12:04 to 12:10 (6 minutes) to be consumed and everything older to be lost.
But the 12:00 data is consumed anyway; it just falls into an older window.
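The 12:00 example can be checked with a small plain-Java sketch of how a record timestamp maps to hopping windows. It assumes windows are aligned to the epoch and start at multiples of the advance interval, which is my understanding of how Kafka Streams hopping windows behave. The class `HoppingWindows` and the method `windowStartsFor` are hypothetical names, not library API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Kafka Streams.
final class HoppingWindows {
    // Start times (epoch millis) of every epoch-aligned hopping window
    // [start, start + size) that contains the given record timestamp.
    static List<Long> windowStartsFor(long timestampMs, Duration size, Duration advance) {
        long sizeMs = size.toMillis();
        long advanceMs = advance.toMillis();
        List<Long> starts = new ArrayList<>();
        // Earliest aligned start whose window still covers the timestamp.
        long start = Math.max(0L, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (; start <= timestampMs; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Record pushed at 12:00; the wall-clock time (12:10) plays no role here.
        long recordTs = Instant.parse("2024-01-01T12:00:00Z").toEpochMilli();
        List<Long> starts = windowStartsFor(recordTs, Duration.ofMinutes(6), Duration.ofMinutes(1));
        System.out.println(starts.size()); // 6
        for (long s : starts) {
            System.out.println(Instant.ofEpochMilli(s)); // window starts from 11:55 to 12:00
        }
    }
}
```

Under these assumptions, the window is derived from the record's timestamp alone, so the 12:00 record lands in the six windows starting between 11:55 and 12:00 even though the application only started at 12:10.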