Search code examples
apache-flinkflink-streaming

when will flink consider a checkpoint complete? before or after the sink function?


when will flink consider a checkpoint complete? There are two ways:

  1. flink will consider checkpoint N complete as soon as all sink functions have received check barrier N.
  2. flink will consider checkpoint N complete when all sink functions have processed barrier N successfully.

Which one is true? cause I can find any documentation about this.


Solution

  • Once all of the tasks (including the sinks) have reported back to the checkpoint coordinator that they have finishing writing out a snapshot of their state, the checkpoint coordinator will write the checkpoint metadata, and then notify all participants that the checkpoint is complete. So #2 is correct -- a checkpoint is not complete until after the sink functions have processed the barrier.

    For sinks doing two phase commits, the complete story is somewhat more complex. See https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html for the details.