apache-flink

What is a slot in a Flink Task Manager?


In the Apache Flink system architecture, there are three kinds of processes: the client process, the master process (JobManager), and the worker processes (TaskManagers).

Each of these is basically a JVM process. A TaskManager executes individual tasks, with each task being executed in a thread. So the manager-to-process and task-to-thread mappings are clear.

What about slots in a TaskManager? What is a slot mapped to?


Solution

  • Task slots in Flink are the primary unit of resource management and scheduling.

    When the Dispatcher (part of the Flink Master) receives a job to be executed, it looks at the job's execution graph to see how many slots will be needed to execute it, and requests that many slots from the Resource Manager. The Resource Manager will then do what it can to obtain those slots (there is a Yarn Resource Manager, a Kubernetes Resource Manager, etc.). For example, the Kubernetes Resource Manager will start new Task Manager pods as needed to create more slots.

Each Task Manager is configured with some amount of memory, some number of CPU cores, and some number of slots it offers for executing tasks. Those slots share the resources available to the Task Manager.
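As a sketch, the slot count and memory for each Task Manager are set in `flink-conf.yaml`; the keys below are the standard ones, and the values are illustrative, not recommendations:

```yaml
# flink-conf.yaml -- illustrative values
taskmanager.numberOfTaskSlots: 4          # slots offered by each TaskManager
taskmanager.memory.process.size: 4096m    # total memory for the TaskManager process
```

With this configuration, the four slots of a Task Manager share the 4 GB allotted to that process.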

    Typically a slot will be assigned the tasks from one parallel slice of the job, and the number of slots required to execute a job is typically the same as the degree of parallelism of the task with the highest parallelism. I say "typically" because if you disable slot sharing (slot sharing allows multiple tasks to share the same slot), then more slots will be required -- but there's almost never a good reason to disable slot sharing.
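The arithmetic above can be sketched as a small calculation. The operator names and parallelisms below are hypothetical, chosen to match the example job described next:

```python
# Hypothetical per-operator parallelisms for a simple job.
# "source+map" is one chained task, as in the example that follows.
parallelisms = {"source+map": 2, "window": 2, "sink": 1}

# With slot sharing (the default), one slot can hold one parallel slice of
# the whole pipeline, so the job needs as many slots as its most parallel
# operator.
slots_with_sharing = max(parallelisms.values())

# With slot sharing disabled, every parallel task instance needs its own
# slot, so the requirement is the sum of all operator parallelisms.
slots_without_sharing = sum(parallelisms.values())

print(slots_with_sharing)     # 2
print(slots_without_sharing)  # 5
```

This is why disabling slot sharing inflates the slot requirement from 2 to 5 for this job without buying you anything.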

    The figure below shows the execution graph for a simple job, where the source, map, and window operators have a parallelism of two, and the sink has a parallelism of one. The source and map have been chained together into a single task, so this execution graph contains a total of 5 tasks that need to be assigned to task slots.

[figure: execution graph]

This next figure shows two Task Managers, each with one slot, and you can see how the scheduler has assigned the 5 tasks across these 2 slots.

[figure: task slots]