Search code examples
apache-flink

When Flink source operator is parallelism, is the input order of a single partition assured?


I know that one message from a Topic can only be consumed by one consumer from one Group, and in a consumer Group, each consumer is responsible for one or more partition. If I have 4 partitions for a Topic and to get the data I use env.addSource(FlinkKafkaConsumer).setParallism(4), will it actually creates 4 consumer instance? If not, how to assure the order of messages when 4 consumer sharing the same partition?


Solution

  • For your first question: "If I have 4 partitions for a Topic and to get the data I use env.addSource(FlinkKafkaConsumer).setParallism(4), will it actually creates 4 consumer instance?"

    Yes, flink will create 4 instances.

    For you second question: "How to assure the order of messages when 4 consumer sharing the same partition?"

    Only one consumer will receive messages, the other ones will be idle doing nothing. Flink guarantees the order by partition, you will receive messages in order.

    enter image description here

    More details at: https://www.ververica.com/blog/kafka-flink-a-practical-how-to