Search code examples
architectureapache-flink

How is a job assigned to TaskManager in Apache Flink


Well, basically I know the whole process of the assignment of job:

  1. Generate StreamGraph at JobClient;
  2. Generate JobGraph at JobClient;
  3. Send JobGraph to JobMaster to generate ExecutionGraph;
  4. Send ExecutionGraph to TaskManager to generate Physical Execution Plan.

But I don't know what kind of thing is sent exactly. For example, when the JobMaster sends ExecutionGraph to some TaskManager, what is exactly the ExecutionGraph? Is it a file? some kind of stream? or something else?

If it is a file, what if the file is too huge? Is there some limitation of size?

If it is a stream, what if sending the stream spends too much time? Is there some limitation of timeout?


Solution

  • Job Manager does not send the execution graph. In my opinion, the Execution graph is the logical view of the execution of the tasks (i.e. parallel instances of an operator-chain aka Task, slot-sharing group of tasks).

    JobManager searches for available task-slots to allocate to slot-sharing groups. The tasks in those slot-sharing group are sent to the Task Managers that have the respective slots. Afterward, those tasks run in the slot in different threads.

    Moreover, Any kind of messaging between the Job Manager and Task Managers is done by Akka.