I'm using Apache Flink v1.9.1
I have a keyed (partitioned) data stream that needs to generate some tuples of data and some control signals that need to be looped back to the generator. Except I want the control signal stream to be broadcast to all the partitioned/parallel tasks of the generator. The control signal is of a different data type than the main data stream.
Is this even possible to implement this using broadcast and iterate? If not is there any other way to achieve this?
Here's rough pseudocode of what I'm trying to do:
IterativeStream<Integer> iteration = initialkeyedStream.iterate();
DataStream<Tuple2<Integer, Long>> mainDataStream = iteration.process(/*some stuff happens here/*);
DataStream<String> sideOutputStream = mainDataStream.getSideOutput(outputTag); //control signal stream
DataStream<String> bcsideOutputStream = sideOutputStream.broadcast(); *//or should this be BroadcastStream<String> bcsideOutputStream?*
iteration.closeWith(bcsideOutputStream);
It's recommended not to use iterations for streaming, as they prevent checkpointing from being used. If you care about fault tolerance guarantees, an alternative would be to send the loopback control events to a sink, and add a source that reads from that sink. Or you might consider using the Stateful Functions API instead.
As for your original question, I don't know if it's possible to use a broadcast stream with iterations. Broadcast streams were added well after iterations were effectively abandoned, so if it does happen to work I'd consider that more-or-less accidental, and not a supported feature.