apache-flink, flink-streaming

In which scenarios is BroadcastConnectedStream in Flink really helpful?



A small example with an explanation would be helpful.


Solution

  • I've written some examples that you can find here:

    1. https://github.com/ververica/flink-training-exercises/blob/master/src/main/java/com/ververica/flinktraining/examples/datastream_java/broadcast/BroadcastState.java
    2. https://training.ververica.com/exercises/nearestTaxi.html
    3. https://training.ververica.com/exercises/ongoingRides.html
    4. https://training.ververica.com/exercises/taxiQuery.html

    In general, broadcast state is useful whenever every parallel instance of an operator needs the same information. Most data sources are partitioned so that separate instances can process them in parallel, but some information is needed globally, such as currency exchange rates, thresholds, or machine learning models. If that globally useful data is static, you can simply load it from a file; if it needs to be updated dynamically at runtime, then using a broadcast stream makes sense.
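
To make the distinction between partitioned and broadcast data concrete, here is a minimal sketch in plain Java, using the currency exchange rate case. It is a simulation of the idea, not Flink API code: the names `BroadcastSketch`, `Worker`, `Job`, `broadcastRate`, and `sendOrder` are hypothetical, and the rates are made-up values. Each `Worker` stands in for one parallel operator instance with its own copy of the rates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the broadcast pattern; this is NOT Flink API code.
class BroadcastSketch {

    /** One parallel operator instance, holding its own copy of the broadcast state. */
    static class Worker {
        private final Map<String, Double> ratesToUsd = new HashMap<>();

        // Broadcast side: every worker receives every rate update.
        void onRateUpdate(String currency, double rateToUsd) {
            ratesToUsd.put(currency, rateToUsd);
        }

        // Partitioned side: a worker only sees the orders routed to it.
        String onOrder(String orderId, String currency, double amount) {
            Double rate = ratesToUsd.get(currency);
            if (rate == null) {
                return orderId + ": no rate yet for " + currency;
            }
            return orderId + ": " + (amount * rate) + " USD";
        }
    }

    /** Routes each order to a single worker and each rate update to all workers. */
    static class Job {
        private final List<Worker> workers = new ArrayList<>();

        Job(int parallelism) {
            for (int i = 0; i < parallelism; i++) {
                workers.add(new Worker());
            }
        }

        void broadcastRate(String currency, double rateToUsd) {
            for (Worker w : workers) {
                w.onRateUpdate(currency, rateToUsd);
            }
        }

        String sendOrder(String orderId, String currency, double amount) {
            // Hash partitioning: the order id decides which worker handles it.
            int index = Math.floorMod(orderId.hashCode(), workers.size());
            return workers.get(index).onOrder(orderId, currency, amount);
        }
    }

    public static void main(String[] args) {
        Job job = new Job(3);
        // No rate has been broadcast yet, so the order cannot be converted.
        System.out.println(job.sendOrder("order-0", "EUR", 10.0));

        job.broadcastRate("EUR", 1.5); // illustrative rate, not real data
        for (int i = 1; i <= 4; i++) {
            System.out.println(job.sendOrder("order-" + i, "EUR", 10.0));
        }

        job.broadcastRate("EUR", 2.0); // a runtime update reaches every worker
        System.out.println(job.sendOrder("order-5", "EUR", 10.0));
    }
}
```

In Flink itself, these roles map onto the real API roughly as follows: the rate stream is turned into a `BroadcastStream` with `DataStream.broadcast(MapStateDescriptor)`, the order stream is joined to it with `connect(...)` (which yields the `BroadcastConnectedStream` from the question), and a `BroadcastProcessFunction` supplies `processBroadcastElement` (like `onRateUpdate` above) and `processElement` (like `onOrder`). The first order in the sketch also shows a point worth planning for: elements can arrive before the broadcast data they depend on.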