
Flink BroadcastProcessFunction vs "Broadcast Variables"


In a Flink application I need to pass a complex object to all downstream operators. I have found two mechanisms for this:

  • BroadcastProcessFunction
  • "Broadcast Variables"

Could you advise when I should choose one over the other?


Solution

  • Broadcast variables were a way to share static configuration data with operators at initialization time in batch jobs written with the now-defunct DataSet API.

    A BroadcastProcessFunction is used to process a stream of updates to broadcast state; this is part of the DataStream API. See the docs on The Broadcast State Pattern for more info.

    A BroadcastProcessFunction or KeyedBroadcastProcessFunction is useful when you want to make a relatively small set of data available to all parallel instances of a process function, e.g., rules used by a rules engine, or foreign currency exchange rates.
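
To make the exchange-rate case concrete, here is a minimal sketch of the broadcast state pattern with a BroadcastProcessFunction. The class names (BroadcastRatesSketch, ConvertToUsd), the state name "rates", and the Tuple2 layouts (currency, amount) and (currency, rate) are assumptions made up for illustration; only the Flink DataStream API calls are real. It assumes the Flink streaming Java dependency is on the classpath.

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

/** Hypothetical example: convert payments to USD using broadcast exchange rates. */
final class BroadcastRatesSketch {

    // Shared by broadcast() and the function: currency code -> rate to USD.
    static final MapStateDescriptor<String, Double> RATES =
            new MapStateDescriptor<>("rates", Types.STRING, Types.DOUBLE);

    /** Connects (currency, amount) payments with (currency, rate) updates. */
    static DataStream<Double> convert(
            DataStream<Tuple2<String, Double>> payments,
            DataStream<Tuple2<String, Double>> rateUpdates) {
        // Every parallel instance of the downstream function receives every rate update.
        BroadcastStream<Tuple2<String, Double>> broadcastRates = rateUpdates.broadcast(RATES);
        return payments.connect(broadcastRates).process(new ConvertToUsd());
    }

    static final class ConvertToUsd extends BroadcastProcessFunction<
            Tuple2<String, Double>, Tuple2<String, Double>, Double> {

        @Override
        public void processElement(
                Tuple2<String, Double> payment, ReadOnlyContext ctx, Collector<Double> out)
                throws Exception {
            // The non-broadcast side has read-only access to the broadcast state.
            Double rate = ctx.getBroadcastState(RATES).get(payment.f0);
            if (rate != null) {
                out.collect(payment.f1 * rate);
            }
            // Assumption of this sketch: payments with no known rate are silently dropped.
        }

        @Override
        public void processBroadcastElement(
                Tuple2<String, Double> update, Context ctx, Collector<Double> out)
                throws Exception {
            // The broadcast side may modify the state; each instance applies the same update.
            ctx.getBroadcastState(RATES).put(update.f0, update.f1);
        }
    }
}
```

Flink gives no ordering guarantee between the two inputs, so a payment can arrive before the rate it needs. A production job would typically buffer such elements rather than drop them as this sketch does.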