Search code examples
apache-flinkflink-streaming

Flink Broadcast State Pattern: Avoiding inconsistency


In Flink documentation we are warned that: "Order of events in Broadcast State may differ across tasks".

How can one implement an application with a consistent broadcast state across tasks, that allows deletes or updates (and not just additions)?

If order is not guaranteed then one task might get a "delete" event before it got a "create" event, or an "update" event that sets an old version after it got an event to set a new version.

Are there any measures one can take to minimize the out-of-order risk? (e.g. parallelism = 1, single Kafka partition etc...)


Solution

  • If the broadcast stream originates from an operator with a parallelism of 1, then this isn't an issue. It's only if your topology allows for race conditions among parallel broadcast streams that you have to worry about this.