Flink: Difference between MaxOutOfOrderness and AllowedLateness

In Flink there are 2 things which provide similar behaviour. What is the difference between the two.

MaxOutOfOrderness: Used with BoundedOutOfOrdernessTimestampExtractor. Allows elements of a stream to be out of order by a magnitude of maxOutOfOrdeness value by delaying the Watermark behind Event time by MaxOutOfOrderness value.
AllowedLateness: Keeps a window state for some more time defined by this parameter.

Why would you want to use AllowedLateness when you can already achieve the same behaviour by maxOutOfOrderness.

If you only use Allowedlateness then, there is no point of waiting as the late elements will be out of order and hence will be dropped.

If you only use MaxOutOfOrderness, then it will delay the window computation but it can handle out of order events.

Solution

MaxOutOfOrderness determines how far the watermark for a stream lags behind the maximum timestamp that's been observed so far -- which in turn determines when any event time timers will fire. These timers might belong to windows, or process functions.

The watermark also defines which events are late -- events with timestamps less than the current watermark are late.

The window API has a notion of allowed lateness, which determines how long the window state is retained. An event time window will fire when the watermark passes the window's end point -- and in the case where there is some allowed lateness, then the window will fire again as each late event arrives, up until the allowed lateness expires (this firing behavior can be customized -- this is the default). Once the allowed lateness expires the window's state is purged, and thereafter late events are either dropped or sent to a side output (if one is configured).

So, to sum up:

AllowedLateness only applies to event time windows, while MaxOutOfOrderness pertains to all uses of watermarks (e.g., process functions).

It is useful to have both mechanisms because you can have windows that fire both at the natural window ending as defined by the watermarks, and again with updated results as late events arrive.

The purpose of watermarks is to give control over the tradeoff between latency and completeness. It's useful to be able to demand that results be produced with low latency (i.e. a relatively short MaxOutOfOrderness) while at the same time accommodating events that are quite late.

And to correct one thing: if MaxOutOfOrderness is zero and there is AllowedLateness, then you will probably have a lot of late events (unless everything is in order), but they will only be dropped (by a window) if their lateness exceeds the allowed lateness.