I am consuming a Kafka topic with more than 50 partitions using FlinkKafkaConsumer(...). I would like to create windows over the data from these partitions. However, I don't want any shuffling, so I can't use DataStream.keyBy(...). If I call DataStream.windowAll(...), the parallelism will be 1, which is also not what I want.
So is there any way to keep the high parallelism and avoid data shuffling at the same time?
Thanks
Without using keyBy, your options become rather limited. You could implement some sort of parallel windowing with a (non-keyed) ProcessFunction, but you won't have access to timers or keyed state, just operator state.
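To make that concrete, here is a minimal sketch in plain Java (no Flink dependency) of the kind of buffering logic each parallel subtask could run from inside processElement. The class name LocalTumblingWindows and its add method are hypothetical, not part of Flink. Because there are no timers, the sketch assumes a window can only be closed when a later element arrives, and it treats each element's timestamp as the current event time for that subtask, which is only reasonable if records are roughly in order within a partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical helper: buffers elements into tumbling event-time windows
 * that are local to one parallel subtask (no keying, no shuffling).
 */
class LocalTumblingWindows<T> {
    private final long windowSizeMs;
    // Window start timestamp -> elements buffered for that window.
    // A TreeMap keeps the windows ordered by start time.
    private final TreeMap<Long, List<T>> buffers = new TreeMap<>();

    LocalTumblingWindows(long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("windowSizeMs must be positive");
        }
        this.windowSizeMs = windowSizeMs;
    }

    /**
     * Adds an element and returns the contents of every window that ended
     * at or before the element's timestamp. Without timers, this is the only
     * point at which windows can fire (assumption of this sketch).
     */
    List<List<T>> add(T element, long timestampMs) {
        // Align the timestamp to the start of its tumbling window.
        long windowStart = timestampMs - Math.floorMod(timestampMs, windowSizeMs);
        buffers.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(element);

        // A window [start, start + size) is complete once start + size <= timestamp,
        // i.e. start <= timestamp - size.
        Map<Long, List<T>> ready = buffers.headMap(timestampMs - windowSizeMs, true);
        List<List<T>> closed = new ArrayList<>(ready.values());
        ready.clear(); // clearing the view also removes the entries from the buffer
        return closed;
    }
}
```

In a real job, a non-keyed ProcessFunction could hold one such buffer per subtask, call add(value, ctx.timestamp()) in processElement, and emit the returned windows through the collector. Two caveats follow from the answer above: the last window of a quiet partition never fires until another element arrives, and the buffer would have to be saved as operator state (for example via CheckpointedFunction) to survive a failure.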