apache-flink, flink-streaming

Flink task slots are not evenly distributed when setting operator parallelism larger than default parallelism


I'm running a Flink job on a cluster of 3 task managers (running on 3 Kubernetes pods). The job's default parallelism is 9, and one of the operators is set to parallelism 18. The total number of task slots is set to 18 (the largest parallelism value).

I observe the following behavior:

The operator set to parallelism 18 is evenly distributed across all task slots.

All other operators (set to default - 9) are not distributed equally. For example:

  • TM1: running 2 sub-tasks
  • TM2: running 5 sub-tasks
  • TM3: running 2 sub-tasks

Can someone please explain the following -

  • What causes this uneven distribution?
  • Can I make the operator assignment balanced? If so, how?

(Running with Flink v1.6.3)


Solution

  • At the moment, Flink does not let you control how tasks are spread across TaskManagers. Flink assumes all slots are equal and therefore does not try to spread tasks out uniformly. The community wants to add this functionality, though. Here is the respective issue.

    Update

    The problem has been fixed for Flink >= 1.9.2. To spread tasks out across TaskManagers, set cluster.evenly-spread-out-slots: true in your flink-conf.yaml.
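
    For the setup in the question, the relevant part of flink-conf.yaml might look roughly like the fragment below. This is only a sketch: the value of 6 slots per TaskManager is an assumption (18 slots divided over 3 TaskManagers), and parallelism.default is shown only to mirror the job's default of 9. Since this is a cluster-level option, it most likely needs to be in place when the cluster starts rather than changed per job.

```yaml
# flink-conf.yaml (fragment, Flink >= 1.9.2)

# Ask the scheduler to spread slot allocations across all available
# TaskManagers instead of treating every free slot as equivalent.
cluster.evenly-spread-out-slots: true

# Assumed for illustration: 3 TaskManagers x 6 slots = 18 slots in total.
taskmanager.numberOfTaskSlots: 6

# Mirrors the default parallelism from the question.
parallelism.default: 9
```

    With this in place, the 9 sub-tasks of the default-parallelism operators should end up spread more evenly over the three TaskManagers, instead of the 2/5/2 split described above.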