Let's say I have 8 task managers with 16 task slots. If I submit a job using the Jobmanager UI and set the parallelism to 8, do I only utilise 8 task slots?
What if I have 8 task managers with 8 slots, and submit the same job with a parallelism of 8? Is it exactly the same thing? Or is there a difference in the way the data is processed?
Thank you.
The total number of task slots in a Flink cluster defines the maximum parallelism, but the number of slots used may exceed the actual parallelism. Consider, for example, this job:
If run with parallelism of two in a cluster with 2 task managers, each offering 3 slots, the scheduler will use 5 task slots, like this:
However, if the base parallelism is increased to six, then the scheduler will do this (note that the sink remains at a parallelism of one in this example):
See Flink's Distributed Runtime Environment for more information.