Search code examples
apache-flinkflink-streaming

advanceToEndOfEventTime flag in Flink


I am looking through quite recent API of JobClient and I see advanceToEndOfEventTime flag in method stopWithSavepoint. If I understand correctly, this will cause job to e.g. flush time-based windows. And thus, if we start with this savepoint, job will start with windows without any elements. I don't find that desirable in my current use cases - in all our applications it is crucial to restore state as it is after the restart. I wonder what's the use case for that?


Solution

  • One situation where this would be useful is in a case where you know the job is done, and there won't be any further input. If the sources are all finite, like a file, Flink will automatically advance the current watermark to Watermark.MAX_WATERMARK, and stop the job.

    But with potentially unbounded streaming sources this doesn't happen, even if they are bounded -- the job you'd like to stop just sits there, waiting for more events to process (that will never come), and holding onto the final set of results that you'd like to flush out. The advanceToEndOfEventTime option lets you cleanly shut this down.