I have a cluster of 3
machines with 4
cores each. Each machine has one task manager. I know that the number of slots in Flink can be controlled by taskmanager.numberOfTaskSlots
. I initially had allotted 12
slots in total (every task manager had 4
slots). Although, there is no explicit CPU isolation among slots (as mentioned here), I assume that each slot is roughly using 1 core. Am I right in assuming this?
I haven't mentioned any slot sharing group in my code and my pipeline does not have any blocking edges. The parallelism of each task is the same and is equal to the number of slots. I am assuming that one subtask from each task will be in a slot. Am I correct in this understanding?
After some conversation (link for the curious minds :-)), I wanted to increase the cores per slot to 2
for my experiments. So, I reduced the taskmanager.numberOfTaskSlots
to 2
on each machine? After doing this, I see that the Flink WebUI shows 6
slots is total and 2
slots for each task manager. I have also reduced the parallelism of each task to 6
. Is this all that I need to do?
Note: I am not using the MVP feature of fine grained resource management right now.
That sounds right.
Each Task Manager is a single JVM. A task slot doesn't correspond to anything physical -- it's just an abstract resource managed by the Flink scheduler. Each task in a task slot is an instance of an operator chain in the execution graph, and each task is single-threaded. No two instances of the same operator chain will ever be scheduled into the same slot.
All of the threads for all of the tasks in all of the slots in given task manager will compete for the resources available to that JVM: cores, memory, etc.
As you have noted, there is no way to explicitly set the number of cores per slot. And there's no requirement that the number be an integer. You could, for example, decide that your 4-core TMs are each providing 3 slots, for a total parallelism of 9 across the 3 TMs.