Search code examples
apache-flinkflink-streaming

What is the difference between periodic and punctuated watermarks in Apache Flink?


Will be helpful if someone give usecase example to explain the difference between each of the Watermark API with Apache flink given below

  • Periodic watermarks - AssignerWithPeriodicWatermarks[T]
  • Punctuated Watermarks - AssignerWithPunctuatedWatermarks[T]

Solution

  • The main difference between the two types of watermark is how/when the getWatermark method is called.

    periodic watermark

    With periodic watermarks, Flink calls getCurrentWatermark() at regular interval, independently of the stream of events. This interval is defined using

    ExecutionConfig.setAutoWatermarkInterval(millis)
    

    Use this class when your watermarks depend (even partially) on the processing time, or when you need watermarks to be emitted even when no event/elements has been received for a while.

    punctuated watermarks

    With punctuated watermarks, Flink calls checkAndGetWatermark() on each new event, i.e. right after calling assignWatermark(). An actual watermark is emitted only if checkAndGetWatermark returns a non-null value which is greater than the last watermark.

    This means that if you don't receive any new element for a while, no watermark can be emitted.

    Use this class if certain special elements act as markers that signify event time progress, and when you want to emit watermarks specifically at certain events. For example, you could have flags in your incoming stream marking the end of a sequence.