I have two streams, stream1 and stream2. Streams are coming from Kafka Topics. Stream1 is a KeyedStream which contains the main data that I want to process in keyed-context. Stream2 is simply a "trigger" stream. What I mean is that Stream1 will print the results it has generated so far only when a trigger arrives from stream2. My problem is that I cant find a way to broadcast the trigger event of stream2 to all partitions of stream1, so they all start printing the results. Any ideas on how to do this? Note: I dont want to use Windows.
Moreover, in flink documentation I see an operator "broadcast" : Datastream -> DataStream but I haven't found any examples to understand how this is used. Can someone explain?
Thank you in advance
See The Broadcast State Pattern page, which has an example of doing something similar to your use case. To summarize...
stream1
element is received by processElement()
, you save it in (keyed) state.processBroadcastElement
, you get use the ctx.applyToKeyedState(StateDescriptor<S, VS> stateDescriptor, KeyedStateFunction<KS, S> function)
method to access/emit all of the records you've saved in state for stream1
.