flink-streaming

Union of bounded and unbounded streams in Flink


Can we use union to combine a bounded stream from an API with an unbounded stream from Kafka in Flink? Would it cause any issues?


Solution

  • In general I would expect this to work. Note that when the bounded stream terminates, its source will generate a final watermark with the value MAX_WATERMARK. That final watermark prevents data from the bounded source from being buffered in state indefinitely, which could otherwise happen because the finished source won't emit any subsequent watermarks.

    Also, if the unbounded and bounded sources have different parallelisms, there will be a rebalance as a result of the union. If you're expecting data from either source not to be shuffled, this can break that expectation.
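
    To make the watermark point above concrete, here is a rough, stdlib-only sketch of how an operator downstream of a union combines watermarks: it tracks the latest watermark per input and uses the minimum as its own event time. This is a simplified model, not Flink code. The class UnionWatermarkSketch and the method onWatermark are hypothetical names made up for illustration, and details such as idle sources are ignored. The only Flink fact it relies on is that MAX_WATERMARK carries the timestamp Long.MAX_VALUE.

```java
import java.util.Arrays;

/** Simplified model of watermark handling after a union (hypothetical, not the Flink API). */
final class UnionWatermarkSketch {

    // Flink's Watermark.MAX_WATERMARK carries the timestamp Long.MAX_VALUE.
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    // Latest watermark seen on each input; Long.MIN_VALUE means "none yet".
    private final long[] inputWatermarks;

    UnionWatermarkSketch(int numInputs) {
        inputWatermarks = new long[numInputs];
        Arrays.fill(inputWatermarks, Long.MIN_VALUE);
    }

    /** Records a watermark from one input and returns the combined (minimum) watermark. */
    long onWatermark(int input, long watermark) {
        // A watermark never moves backwards on a given input.
        inputWatermarks[input] = Math.max(inputWatermarks[input], watermark);
        return Arrays.stream(inputWatermarks).min().getAsLong();
    }

    public static void main(String[] args) {
        final int bounded = 0; // e.g. the finite API source
        final int kafka = 1;   // the unbounded Kafka source
        UnionWatermarkSketch op = new UnionWatermarkSketch(2);

        op.onWatermark(kafka, 5_000L);                             // bounded input has sent nothing yet
        System.out.println(op.onWatermark(bounded, 1_000L));       // 1000: bounded input holds time back
        System.out.println(op.onWatermark(kafka, 9_000L));         // 1000: still held back
        System.out.println(op.onWatermark(bounded, MAX_WATERMARK)); // 9000: bounded input finished
        System.out.println(op.onWatermark(kafka, 12_000L));        // 12000: Kafka alone drives time now
    }
}
```

    In this model, if the bounded input stopped at 1000 without ever sending MAX_WATERMARK, the combined watermark would stay at 1000 no matter how far Kafka advanced, which is the "buffered in state indefinitely" case the final watermark avoids.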