apache-flink

Why should users have to set parallelism explicitly?


I started a Flink application with n TaskManagers and s slots per TaskManager, so my application has n*s slots in total.

That means Flink can run at most n*s subtasks at the same time. So why doesn't Flink try to use as many resources as possible and run as many subtasks as it can, instead of making end users set the parallelism explicitly?

A Flink beginner who doesn't know about the parallelism setting (default is 1) will always get only one subtask, even when more resources are available!

I would like to know the design considerations here, thanks!


Solution

  • A Flink cluster can be shared by multiple users, and a single user can run multiple jobs on one cluster. Such clusters are sized to run several jobs, not just one. In these environments it is not desirable for a job to grab all available resources by default.
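To make the shared-cluster argument concrete, here is a minimal toy model in plain Java. It is not Flink's scheduler and uses no Flink API: the class `SlotPoolSketch` and its methods are hypothetical names made up for illustration, and it assumes the simplification that a job needs exactly as many slots as its parallelism.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a shared slot pool. NOT Flink's real scheduler; all names
// here are hypothetical. Assumption: a job needs one slot per unit of
// parallelism and holds its slots for as long as it runs.
class SlotPoolSketch {
    private final int totalSlots;
    private int freeSlots;
    private final Map<String, Integer> runningJobs = new LinkedHashMap<>();

    SlotPoolSketch(int taskManagers, int slotsPerTaskManager) {
        // n TaskManagers with s slots each give n * s slots in total.
        this.totalSlots = taskManagers * slotsPerTaskManager;
        this.freeSlots = totalSlots;
    }

    int totalSlots() {
        return totalSlots;
    }

    int freeSlots() {
        return freeSlots;
    }

    // Tries to start a job; returns false if not enough slots are free.
    boolean submit(String jobName, int parallelism) {
        if (parallelism < 1 || parallelism > freeSlots) {
            return false;
        }
        freeSlots -= parallelism;
        runningJobs.put(jobName, parallelism);
        return true;
    }

    public static void main(String[] args) {
        // Case 1: the first job greedily takes every slot (the "use everything" default).
        SlotPoolSketch greedy = new SlotPoolSketch(3, 4);
        System.out.println("greedy jobA started: " + greedy.submit("jobA", greedy.totalSlots()));
        System.out.println("greedy jobB started: " + greedy.submit("jobB", 1));

        // Case 2: each job asks only for the parallelism it was configured with.
        SlotPoolSketch shared = new SlotPoolSketch(3, 4);
        System.out.println("shared jobA started: " + shared.submit("jobA", 4));
        System.out.println("shared jobB started: " + shared.submit("jobB", 4));
        System.out.println("slots still free: " + shared.freeSlots());
    }
}
```

In the first case the second job cannot start at all because the first one holds every slot; in the second case both jobs run and slots remain for a third. This is the trade-off behind a conservative default: a job that should use the whole cluster can still ask for it explicitly.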